FOREWORD 


Spatial  data  management  techniques  are  rapidly  becoming  practical  tools 
for  use  by  Corps  of  Engineers  field  offices  in  a variety  of  their  respon- 
sibilities. Various  aspects  of  these  techniques  have  been  applied  in 
traditional  survey  and  Phase  I General  Design  Memorandum  Studies  and  on 
a rather  grand  scale  in  the  Expanded  Flood  Plain  Information  studies  of 
the  Corps  Flood  Plain  Management  Services  program.  More  than  a dozen 
field  offices  have  made  use  of  the  concepts  and  techniques  of  spatial 
data  management  within  the  past  two  years.  A particularly  knotty  issue 
that  eventually  surfaces  in  all  studies  is  the  resolution  of  data  capture 
within  data  variables  and  between  data  variables  in  the  spatial  data  files. 

A seemingly  simple  idea  of  managing  different  data  variables  by  varying 
the  resolution  (in  effect  different  grid  sizes),  within  the  data  files  is 
fraught  with  conceptual  and  technical  pitfalls.  As  an  initial  step  in 
dealing  with  this  issue  a small  group  of  experienced  researchers  and  prac- 
titioners were  invited  to  convene  at  the  Hydrologic  Engineering  Center  (HEC) 
for  an  informal  seminar  to  explore  several  aspects  of  variable  resolution 
data  capture  in  spatial  data  management  techniques.  These  proceedings 
are  the  papers  and  discussions  presented  at  this  seminar,  where  the  par- 
ticipants explored  a range  of  alternative  accommodations  of  variable 
resolution  concerns  which,  to  a substantial  degree,  represent  documenta- 
tion of  the  present  state-of-the-art.  It  is  hoped  that  these  proceedings 
will  serve  as  a stimulus  to  motivate  interested  researchers  and  practitioners 
to  actively  renew  or  continue  efforts  to  practically  deal  with  this  par- 
ticularly interesting  and  difficult  technical  problem. 

The  papers  and  discussions  are,  in  general,  frank  statements  by  the  authors 
and  other  seminar  participants  and  are  not  to  be  construed  as  official 
Corps  documents.  The  views  and  comments  expressed  are  those  of  the  seminar 
participants,  and  are  not  intended  to  modify  or  replace  official  guidance 
or  directives  such  as  engineer  regulations,  manuals,  circulars,  or  techni- 
cal letters  issued  by  the  office  of  the  Chief  of  Engineers. 




CONTENTS


FOREWORD

VARIABLE GRID RESOLUTION - ISSUES AND REQUIREMENTS
    Problem Statement Used as the Basis for Discussion
    by Each Seminar Participant

BACKGROUND AND SEMINAR OBJECTIVES
    Darryl W. Davis
    Chief, Planning Analysis Branch
    The Hydrologic Engineering Center

SPATIAL DATA MANAGEMENT AND COMPREHENSIVE ANALYSIS SYSTEM (HEC-SAM)
    Darryl W. Davis
    Paper Reproduced for the Seminar as a Reference to HEC's
    Current Use of Spatial Data Management Technology

GEOGRAPHIC DATA HANDLING ISSUES:
AN ALTERNATIVE TO VARIABLE GRID RESOLUTION
    Kenneth J. Dueker
    Director, Institute of Urban and Regional Research
    University of Iowa
    and
    Robert H. Ericksen
    Doctorate Candidate, Department of Geography
    University of Iowa
    and
    Evan Noynaert
    Doctorate Candidate, Department of Geography
    University of Iowa

VARIABLE GRID RESOLUTION - ISSUES AND REQUIREMENTS:
THE ADAPT SOLUTION
    Richard M. Males
    Vice President
    W.E. Gates & Associates

VARIABLE GRID RESOLUTION
    Jack Dangermond
    Director
    Environmental Systems Research Institute
    and
    Raymond Postma
    Programmer Analyst
    Environmental Systems Research Institute
    and
    William Hodson
    Environmental Scientist
    Environmental Systems Research Institute

SEMINAR PARTICIPANTS


VARIABLE  GRID  RESOLUTION  - ISSUES  AND  REQUIREMENTS 


The Hydrologic Engineering Center is using automated geographic
information systems to manage data for hydrologic, economic and
environmental analysis. The current methodology consists of using grid cell
information at a resolution of 0.25 - 1.53 acres, stored in a computer
data bank in a multivariable format, i.e., all the information pertaining
to a cell is stored in the grid cell record. The grid cell resolution
has in the past been determined by the desire to capture terrain variation
to enable accurate modeling of damage and erosion processes. Most
applications to date have grid cell dimensions derived from the area that a
computer printer character occupies on a USGS 7 1/2-minute quadrangle.
A printer character measures 1/10 of an inch across in the X direction
(200 feet on the ground) and 1/6 or 1/8 of an inch down (333.33 or 250
feet on the ground), depending on whether the printer operates at 6 or 8
lines per inch. A cell which measures 200 x 333.33 feet equals 1.53 acres
(6 lines per inch) and a cell which measures 200 x 250 feet equals about
1.15 acres (8 lines per inch). This rectangular cell size is used to
enable the use of printer graphics for fast and inexpensive scaled
displays to aid in checking data and output of modeling results.
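
The cell dimensions quoted above can be checked with a few lines of
arithmetic. The sketch below is illustrative only: it assumes a map scale of
1:24,000 (consistent with 1/10 inch on the map covering 200 feet on the
ground) and a printer pitch of 10 characters per inch. The function name
cell_size is hypothetical and is not part of any HEC program.

```python
SQ_FT_PER_ACRE = 43_560
MAP_SCALE = 24_000  # assumed scale: 1 inch on the map = 2,000 ft on the ground


def cell_size(chars_per_inch, lines_per_inch, scale=MAP_SCALE):
    """Return (width_ft, height_ft, acres) covered by one printer character."""
    feet_per_map_inch = scale / 12            # ground feet per map inch
    width_ft = feet_per_map_inch / chars_per_inch
    height_ft = feet_per_map_inch / lines_per_inch
    acres = width_ft * height_ft / SQ_FT_PER_ACRE
    return width_ft, height_ft, acres


for lines_per_inch in (6, 8):
    w, h, acres = cell_size(10, lines_per_inch)
    print(f"{lines_per_inch} lines/inch: {w:.0f} x {h:.2f} ft = {acres:.2f} acres")
```

Changing the scale argument shows how the same printer geometry yields
other cell sizes on maps of a different scale.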

The  data  variables  commonly  used  in  modeling  analysis  are: 

1.  Areal  data  - Watershed  and  subbasin  boundaries,  damage  reaches, 
land  use  patterns,  soil  associations,  environmental  habitats, 
city,  county,  and  other  political  boundaries,  and  development 
zoning 

2.  Continuous  data  - Topographic  elevation,  reference  flood  elevation 
(water  surface  elevation  for  a specific  flood  event),  depth  to 
ground  water  and  topographic  slope 

3.  Lineal  data  - Transportation  routes  and  stream  networks 

4.  Point  data  - Archeologic  and  historic  sites,  point  source  infor- 
mation  or  sampling  sites 

The spatial data requirements of the modeling and display programs are:

HYDPAR   - Watershed identification (watershed and subbasin code), land use
           pattern of interest, hydrologic soil group and land surface slope

DAMCAL   - Damage reach identification, land use pattern of interest,
           topographic elevation and reference flood elevation

RIA      - Discrete variable values for distance, impact and attractiveness
           modeling and GRID map display.

GRIDPLOT - Plotter display of data variables or analysis results which have
           been preclassified into classes which have a value of 1 - 10,
           23 for low cells and 25 for high cells.
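
As a rough illustration of the multivariable format described earlier, the
sketch below stores every variable for a cell in one record and lets each
program pull only the variables it needs. It is a minimal sketch under
stated assumptions: the class GridCell, the table PROGRAM_VARIABLES, the
function retrieve, and all variable names and sample values are hypothetical
and do not describe the actual HEC file layout.

```python
from dataclasses import dataclass, field


@dataclass
class GridCell:
    """One multivariable grid cell record: all variables for the cell together."""
    row: int
    col: int
    values: dict = field(default_factory=dict)  # variable name -> value


# Variables each program retrieves, following the requirements listed above.
# The variable names are illustrative assumptions.
PROGRAM_VARIABLES = {
    "HYDPAR": ("subbasin", "land_use", "soil_group", "slope"),
    "DAMCAL": ("damage_reach", "land_use", "elevation", "reference_flood_elevation"),
}


def retrieve(cells, program):
    """Return (row, col, value, ...) tuples holding only the variables a program needs."""
    wanted = PROGRAM_VARIABLES[program]
    return [(c.row, c.col, *(c.values[name] for name in wanted)) for c in cells]


cells = [
    GridCell(0, 0, {"subbasin": 3, "land_use": 5, "soil_group": "B", "slope": 2.5,
                    "damage_reach": 1, "elevation": 712.0,
                    "reference_flood_elevation": 715.5}),
]
hydpar_rows = retrieve(cells, "HYDPAR")
damcal_rows = retrieve(cells, "DAMCAL")
print(hydpar_rows)
print(damcal_rows)
```

With a uniform grid every record carries every variable at the same
resolution, which keeps retrieval this simple but is also the source of the
storage and processing overhead discussed in this problem statement.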

Current  studies  make  use  of  data  banks  constructed  with  a uniform 
grid  cell  size,  that  is  small  enough  to  capture  the  variation  in  the  most 
critical  variable,  which  is  usually  topographic  elevation  for  use  in  damage 
calculations.  This  resolution  is  required  for  grid  cells  within  damage 
reaches  (the  flood  plain),  but  by  using  a uniform  cell  size  throughout  the 
entire  study  area,  many  thousands  of  extra  grid  cells  are  processed  by 
modeling  programs  because  the  information  is  relatively  over-defined  in 
relation  to  the  modeling  requirements.  Figure  1,  Flood  Plain  Resolution, 
shows  this  difference  in  data  requirements. 

Another  demonstration  of  the  same  type  of  problem  is  shown  in  Figure  2, 
Urban  vs.  Rural  Resolution,  where  areas  inside  a city  limit  (or  potential 
urban  area)  may  need  very  detailed  information,  with  less  detail  in  the 
surrounding  areas  being  more  than  adequate. 


DISCUSSION  POINTS 


Given  the  variable  grid  resolution  issues  discussed  above,  how  should 
the  spatial  data  be  stored  and  retrieved? 

1.  Create a data storage structure which allows the HYDPAR, DAMCAL,
GRIDPLOT and RIA programs to retrieve the spatial data they
require, and

2.  Write a short 10 - 15 page report on your solution to the problem.

The  paper  will  be  formally  presented  at  a seminar  attended  by  other 
participants  asked  identical  questions.  The  paper  should  conform  to  the 
following  requirements  and  outline  to  permit  timely  publication  of  seminar 
proceedings. 




OUTLINE  OF  SOLUTION  PAPER 


I    Introduction to suggested philosophy of handling spatial
     data.

II   The data structure proposed for the storage of the spatial
     data.

III  The data flow of information from the proposed data structure
     to each of the HYDPAR, DAMCAL, GRIDPLOT and RIA programs.

IV   How the data structure would conform to computer graphics
     that are based on a single grid cell resolution (GRID and
     GRIDPLOT).

V    Advantages and disadvantages of proposed technique vs. a
     single uniform grid resolution throughout the entire study
     area.

VI   Conclusions and other observations, prognostications.

VII  References


MANUSCRIPT SPECIFICATIONS

I    Each paper will be typed in final form on 8 1/2 x 11 paper
     and will conform to the standard HEC seminar paper
     requirements.

II   All figures and tables should be scaled so that they may
     be reduced to 8 1/2 x 11 pages, and still be readable.




BACKGROUND  AND  SEMINAR  OBJECTIVES 
by 

Darryl W. Davis*


BACKGROUND 


The  Corps  of  Engineers  began  exploring  the  application  of  spatial  data 
management  techniques  to  planning  studies  in  the  late  1960's  with  the 
publication of a contract research document "A Comparative Study of
Resource Analysis Methods" (1) 1/ prepared by the Department of Landscape
Architecture,  Harvard  University,  a pioneer  in  the  development  of  methods 
of  geographic  data  management.  The  Corps  subsequently  requested  the  same 
group  to  prepare  a strategy  for  using  these  methods  in  planning  recreation 
facilities  around  the  periphery  of  Corps  reservoirs.  The  research  report 
"Honey  Hill:  A Systems  Analysis  for  Planning  the  Multiple  Use  of  Control- 
led Water  Areas"  (2)  documented  the  strategy  and  illustrated  its  use  on  a 
small  reservoir  in  the  New  England  area.  The  Corps  Institute  for  Water 
Resources,  (IWR),  the  research  manager  for  the  investigation  at  that  time, 
sponsored  a field  trial  of  the  grid  based  spatial  data  management  method- 
ology in  the  Santa  Ana  river  basin,  near  Los  Angeles,  California.  The 
Santa  Ana  basin  is  some  10,000  sq.  mi.  in  size  and  the  planning  study  was 
comprehensive  in  scope.  A long  list  of  complications,  subsequently  docu- 
mented in  "The  Santa  Ana  River  Basin,  An  Example  of  the  Use  of  Computer 
Graphics  in  Regional  Plan  Evaluation"  (3),  resulted  in  the  trial  being  less 
than  a complete  success.  The  IWR  recognized  that  a major  shortcoming  of 
the  field  trial  was  the  lack  of  access  by  field  staff  to  readily  available 
Corps  expertise  in  the  area  of  data  management  technology.  To  overcome 
this  basic  problem  the  HEC  was  invited  to  investigate  the  concepts  and  meth- 
odology of  the  Honey  Hill  work  and  consider  assuming  the  responsibility  for 
its implementation in Corps studies, if appropriate; the thinking being that
HEC  was  experienced  in  the  development  and  management  of  computerized  tech- 
nology and  was  experienced  in  providing  field  assistance. 


* Chief, Planning Analysis Branch, The Hydrologic Engineering Center,
Davis, California.

1/ References are listed at the end of the paper.




HEC investigated the methodology, particularly the data management aspects,
and it seemed that there was great potential for useful application in
many  of  HEC's  areas  of  interest.  The  program  source  code  for  creating  and 
managing  the  grid  data  banks  and  performing  certain  analysis  and  output 
displays  was  requested  from  Harvard,  the  L.A,  District,  and  IWR.  The  material 
obtained  was  simply  unuseable.  HEC  received  source  code  listings  (no  decks) 
that  were  printed  in  the  wrong  character  set,  no  documentation,  and  no  offer 
of  aid  to  bring  the  software  up  on  HEC  computers.  The  point  of  this  back- 
ground on  the  source  code  is  that  HEC  by  necessity  started  from  scratch  to 
acquire  and/or  create  the  computer  code  that  now  represents  the  currently 
existing  HEC  spatial  data  management  capability.  From  this  experience,  HEC 
has  developed  considerable  knowledge  of  the  concepts  and  it  has  also  become 
intimately familiar with the software/hardware aspects of spatial data
management  technology. 

At about this same time HEC was assisting the St. Louis District in
formulation of alternatives for an authorized interior flood control project. A
mutual  interest  in  applying  the  grid  data  management  methods  to  the  facility 
siting  tasks  of  plan  formulation  resulted  in  HEC  developing  the  Locational 
Attractiveness  capability  of  the  later  more  comprehensive  Resource  Infor- 
mation and  Analysis  (RIA)  program.  This  capability  was  patterned  after  the 
basic  Harvard  work  (2)  and  was  applied  for  this  investigation.  A small  grid 
cell  data  bank  was  created  covering  the  study  area  with  about  a five  acre 
resolution  grid.  The  ideas  for  using  the  grid  data  base  to  perform  quanti- 
tative computations  for  hydrologic  and  damage  analysis  were  developed  at 
this  time.  The  grid  size  seemed  satisfactory  for  the  hydrologic  computations 
but  was  clearly  unsatisfactory  to  capture  the  terrain  variation  (topography) 
accurately  enough  to  permit  credible  damage  calculations  to  be  performed. 

The  creation  of  a smaller  resolution  (about  1.5  acre  grids)  data  bank  was 
initiated  but  to  this  day  (late  1977)  has  yet  to  be  completed.  Shifting  of 
priorities  and  personnel  within  the  St.  Louis  District  has  slowed  progress 
to  a near  halt.  A more  dynamic,  enthusiastic  project  setting  was  needed 
and it appeared as if one had been precisely scheduled.

The  Savannah  District  consented  with  the  Corps  headquarters  to  perform  a 
pilot  study  that  sought  to  establish  a services  posture  that  would  be  a long 
term commitment for advice and analytical assistance by the Corps to local
communities  in  decisions  and  actions  related  to  the  floodplain.  The  scope 
of  services  was  to  be  comprehensive  and  continuous,  i.e.  available  on 
request  for  special  assessments.  HEC  was  asked  to  provide  advice  on  the 
technology  aspects  of  providing  the  community  services  planned  for  the 
pilot  study.  The  general  concept  of  integrated  interactive  use  of  a compre- 
hensive gridded  geographic  and  resource  data  bank  was  adopted  and  develop- 
mental efforts  were  begun  in  earnest  by  HEC.  The  use  of  grid  data  was 
determined  to  be  the  only  spatial  data  management  technique  that  offered 
significant  analytical  opportunities  when  compared  to  polygon  oriented 
approaches. The results of the basic research and development efforts and
a test application are documented in "Phase I Oconee Basin Pilot Study,
Trail Creek Test" (4). The integrated data management and analysis system
is specifically described in detail in "Spatial Data Management and
Comprehensive Analysis System (HEC-SAM)" (5) which is reproduced in these
proceedings for reference purposes. Due to the demonstrated analytical power of
spatial data management techniques, other applications of the technology
are being implemented in other Corps studies, such as the evaluation of
structural and nonstructural flood damage reduction measures (6) and the
evaluation of dredge disposal activities in the San Francisco Bay (7).

It  should  be  noted  at  this  point  that  the  use  of  gridded  data  banks  by  HEC 
and  users  of  HEC  developed  technology  is  substantially  different  from  the 
use  that  is  made  by  geographers  and  users  of  the  Harvard  type  landscape 
analysis.  The  focus  by  HEC  has  been  on  an  output  product  which  is  a step 
or  at  times  two  steps  beyond  the  data  bank.  In  other  words  the  data  bank 
is  a means  to  an  end,  not  an  end  in  itself.  The  product  sought  is  generally 
quantitative engineering type analysis, rather than graphic displays and
simple  statistics  as  has  been  the  more  common  historical  use  of  gridded 
data  banks.  The  contents  of  the  data  bank  are  used  to  calculate  hard  para- 
meters for  detailed  hydrologic  and  water  quality  simulation  models,  accur- 
ately compute  damages,  as  well  as  provide  the  more  traditional  geographic 
analysis. 

Spatial data management methods have been applied in one completed Expanded
Flood  Plain  Information  (XFPI)  study  (8)  and  they  are  being  employed  in 
another  10  XFPI  studies  currently  underway.  HEC  is  assisting  all  studies 
but  one.  The  study  areas  range  in  size  from  15  to  800  square  miles  and 
the  resolution  of  data  capture  varies  from  about  one  quarter  acre  to  4.5 
acres.  The  odd  sizes  are  because  the  cells  are  rectangular  to  permit  line 
printer  output  to  be  undistorted.  The  grid  cell  data  banks  being  created 
range  from  15  variables  to  30  variables.  The  Corps  Waterways  Experiment 
Station  (WES)  is  performing  the  one  remaining  XFPI  study  which  encompasses 
the  largest  area,  some  800  square  miles.  The  WES  has  had  considerable  ex- 
perience over  the  years  in  creating  and  using  gridded  terrain  (topography) 
data  files  in  connection  with  military  mobility  research. 

The  grid  cell  size  to  be  adopted  for  data  capture  eventually  emerged  in  all 
the studies as a significant issue. On the one hand, great fidelity is
required  for  certain  analysis  such  as  topography  for  erosion  analysis,  and 
topography  and  land  use  for  flood  damage  calculations  while  on  the  other  hand, 
considerable  generalization  is  appropriate  for  some  other  data  variables  and 
analysis  such  as  land  use  for  hydrologic  computations.  The  approach  cur- 
rently taken  by  HEC  is  to  capture  all  data  at  the  resolution  needed  for  the 
most  critical  variable  since  there  does  not  presently  exist  technology  that 
will  easily  manage  varying  grid  resolution  within  a grid  cell  data  file. 




Even though all HEC assisted studies have adopted a single uniform grid
for  all  variables,  there  continues  to  be  a feeling  among  field  offices 
performing  the  studies  that  there  must  be  a better  alternative  than  man- 
aging all  of  the  data  at  the  critical  resolution  of  a single  data  variable. 
The  WES  study  adopted  a two  data  bank  approach  - grids  of  about  10  acres 
for  all  variables  off  the  flood  plain  and  grids  of  about  one  fourth  acre 
for  selected  damage  related  variables  in  the  flood  plain.  Unfortunately, 
representatives  of  WES  declined  to  attend  this  seminar  and  represent  that 
particular  (two  data  bank)  approach  to  variable  grid  resolution. 
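
To give a feel for what is at stake, the sketch below compares the number
of grid cells in a single uniform data bank with the number in a two data
bank arrangement like the one described above. The 800 square mile study
area and the cell sizes of about one fourth acre and about 10 acres come
from the text. The flood plain area of 80 square miles is a purely
hypothetical figure chosen for illustration, and the function name
cell_count is likewise an assumption.

```python
ACRES_PER_SQ_MILE = 640


def cell_count(area_sq_mi, cell_acres):
    """Number of grid cells needed to cover an area at a given cell size."""
    return round(area_sq_mi * ACRES_PER_SQ_MILE / cell_acres)


study_area_sq_mi = 800    # largest XFPI study area cited in the text
flood_plain_sq_mi = 80    # hypothetical: assumes 10 percent of the study area

# Single uniform data bank at the critical (flood plain) resolution.
uniform_cells = cell_count(study_area_sq_mi, 0.25)

# Two data banks: fine cells in the flood plain, coarse cells elsewhere.
fine_cells = cell_count(flood_plain_sq_mi, 0.25)
coarse_cells = cell_count(study_area_sq_mi - flood_plain_sq_mi, 10)
two_bank_cells = fine_cells + coarse_cells

print(f"uniform grid : {uniform_cells:,} cells")
print(f"two data bank: {two_bank_cells:,} cells")
print(f"reduction    : {uniform_cells / two_bank_cells:.1f}x")
```

Cell counts alone understate the difference, since the fine data bank in
the approach described holds only selected damage related variables, but
they do not account for the added bookkeeping of maintaining two banks.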


OBJECTIVES 


The  objectives  of  the  seminar  convened  are  as  follows: 

* Sharpen our collective perceptions of the significant
issues related to data capture resolution.

* Define  the  existing  state-of-the-art  in  spatial  data  management. 

* Define  the  issue  and  suggest  solutions  for  resolution  variation 
between  data  variables. 

* Define  the  resolution  issue  and  suggest  solutions  with  respect 
to  the  geographic  scope  and  cell  size  encompassed  by  the  data 
bank,  e.g.  what  are  the  limits  and  bounds  on  the  mass  amount 
of  data  that  can  be  sensibly  handled. 

* Foster  a collective  sense  of  camaraderie  among  the  assembled 
professionals  to  encourage  sharing  of  ideas  for  the  advancement 
of  spatial  data  management  technology. 


REFERENCES

1.  Steinitz et al., A Comparative Study of Resource Analysis Methods,
    Department of Landscape Architecture Research Office, Graduate School
    of Design, Harvard University, July 1969.

2.  Honey Hill: A Systems Analysis for Planning the Multiple Use of
    Controlled Water Areas, IWR Report 71-9, October 1971.

3.  The Santa Ana Basin: An Example of the Use of Computer Graphics in
    Regional Plan Evaluation, IWR Contract Report 75-3, June 1975.

4.  Phase I Oconee Basin Pilot Study, Trail Creek Test, The Hydrologic
    Engineering Center, September 1975.

5.  Davis, D.W., Spatial Data Management and Comprehensive Analysis
    System (HEC-SAM), 21 October 1976.

6.  Webb, R.P. and Burnham, M.W., Spatial Data Analysis of Nonstructural
    Measures, Technical Paper No. 46, The Hydrologic Engineering Center,
    U.S. Army Corps of Engineers, Davis, California, 1976.

7.  Dredge Disposal Study, San Francisco Bay and Estuary, U.S. Army Corps
    of Engineers, San Francisco District, September 1977.

8.  Expanded Flood Plain Information Study, Upper Oconee River Basin, Ga.,
    General Report, U.S. Army Engineer District, Savannah, May 1977.




D. W. DAVIS
21 October 1976


1.  Spatial  Data  Management  and  Comprehensive  Analysis  System  (HEC-SAM) 

2.  Purpose  of  HEC-SAM 


HEC-SAM was created to provide an analytical tool and analysis structure
that would permit Corps of Engineers District offices to provide comprehensive
planning assistance to local governmental units in decisions related to flood
plain management (1). The specific technical purpose was to provide the
capability to assess hydrologic, flood damage, and environmental consequences of
development situations that are reflected by alternative land use patterns and
water management works. The planning environment which the system is designed
to service is urban areas where development pressures are either currently
significant or expected to be significant in the near future and where there
exists a strong desire on the part of the local planning agencies to manage
development in the best interests of the community, giving balanced
considerations to flood hazard consequences of off flood plain and on flood plain
development.


The general analytical strategy that comprises HEC-SAM is to 1) assemble
and catalogue basic geographic and resource information into a computer data
bank, 2) cooperatively, with local agencies, forecast and place into the data
bank selected alternative future development patterns, 3) perform comprehensive
assessments of the selected alternative futures and 4) document the assessment
for study by the general public and community officials. Subsequent assessment
services would be provided on a continuing basis at local agency request.
Specific development proposals would be assessed, land use development policies
analyzed and informed technical guidance provided by the Corps to the local
officials.


The system is presently emerging from the pilot study stage and is being
applied in several studies of the type for which it was created. It has also
proven sufficiently attractive and powerful that managers of some traditional
Corps Survey Investigations plan to make use of major portions of the technology
in their studies.


3.  System  Characteristics 


a.  Software: The HEC-SAM system is comprised of a family of data
management and analysis computer programs that service the full range of
comprehensive assessments. Figure 1 presents a functional flow diagram of the
analysis process and input and output results. About 1/3 of the links shown
on the diagram for the Interface and Analysis programs are presently automated
and these links are intended to be highly automated in the near future.
Table 1 lists and briefly describes the computer software indicated on the
general flow diagram.


References  are  listed  at  the  end  of  the  paper. 




The system has three distinct functional elements: Data Bank Management,
Data Bank Processing Interface, and Comprehensive Analysis. The data bank
management element is comprised of the subfamily of computer programs required
to process raw map or other type data to the grid cell format that becomes the
general data bank. This includes a program that permits displaying data
digitized in the grid cell format (GRID) (2), a program that displays data
digitized in the polygon format and generates grid cell data from polygon format
data (AUTOMAP II) (3), special purpose programs to create grid topographic data
from digitized contour lines (TOPO, LINES), and programs to properly register
polygon data to the base grid system (REGISTER) and place the grid data into the
general data bank (BANK).

The Data Bank Processing Interface element is comprised of computer programs
that compile and reformat geographic and resource data retrieved from the data
bank into a form processable by the general analysis computer programs. The
programs service the functional analysis areas of Flood Hazard, Flood Damage and
Environmental Status. HYDPAR links the data bank to the flood hazard analysis
by retrieving the data variables of hydrologic subbasins, slope, soil group and
land use to generate the modeling parameters required to simulate storm runoff.
The link between the generated modeling parameters and the simulation analysis
program (HEC-1) is currently systematic but not automated. HYDPAR also provides
links to the environmental analysis by retrieving land use, soil and subbasin
data from the data bank and generating modeling parameters required to simulate
the quality of urban storm runoff and land surface erosion. The link between
the data bank through HYDPAR to the analysis program STORM is completely
automated, with the subsequent link to the general dynamic water quality
simulation not presently automated. DAMCAL links the data bank to the flood
damage analysis by retrieving the data variables of damage reach, land use,
topography and reference flood to generate elevation-damage tables by land use
category and damage index location for subsequent integrated analysis. ATODTA
also serves the flood damage analysis by restructuring the DAMCAL generated
data, interfacing it with hydraulic and hydrologic probability data and
providing an automated link to the general hydrologic and damage analysis
program HEC-1. The linkage from the data bank through DAMCAL and ATODTA to the
analysis program is completely automated.
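
The source does not describe HYDPAR's internal computations, so the
following is only a sketch of the kind of per-subbasin tabulation that a
uniform grid cell data bank makes straightforward: tallying area by
subbasin, land use and soil group, and averaging slope over each subbasin.
The function name summarize_subbasins, the tuple layout, and the sample
values are assumptions made for illustration.

```python
from collections import defaultdict


def summarize_subbasins(cells, cell_acres):
    """Tabulate grid cells by subbasin.

    cells: iterable of (subbasin, land_use, soil_group, slope) tuples,
           one tuple per grid cell.
    Returns (area, mean_slope) where area maps
    (subbasin, land_use, soil_group) -> acres and mean_slope maps
    subbasin -> average slope over its cells.
    """
    area = defaultdict(float)
    slope_sum = defaultdict(float)
    count = defaultdict(int)
    for subbasin, land_use, soil_group, slope in cells:
        area[(subbasin, land_use, soil_group)] += cell_acres
        slope_sum[subbasin] += slope
        count[subbasin] += 1
    mean_slope = {sb: slope_sum[sb] / count[sb] for sb in count}
    return dict(area), mean_slope


sample_cells = [
    (1, "residential", "B", 2.0),
    (1, "residential", "B", 4.0),
    (2, "agriculture", "C", 1.0),
]
area, mean_slope = summarize_subbasins(sample_cells, cell_acres=1.5)
print(area)
print(mean_slope)
```

Because every cell has the same area, the tabulation reduces to counting
cells, which is one reason a single uniform resolution is convenient for
the interface programs.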

The Comprehensive Analysis element is comprised of the general
simulation and analysis computer programs that perform the end product detailed
technical assessments that compare the existing condition to the development
condition of interest. In most instances the final analysis computer programs
are standard Corps of Engineers analytical tools that have been in use a number
of years and are thus familiar to potential Corps users. Some programs have
been slightly modified to interface with data being generated from the data
bank rather than in their usual formats. In a few instances basic modifications
were made to the programs to permit or encourage a more systematic analysis
process (than traditional) to take advantage of the opportunities offered by
ready access to a comprehensive data bank. HEC-2 (4) has served the Corps many
years in performing river hydraulic analysis and is used in its traditional form.
HEC-1 (4) serves the double duty of general hydrologic simulation to forecast




SPATIAL DATA MANAGEMENT
AND COMPREHENSIVE ANALYSIS SYSTEM


TABLE 1

HEC-SAM SOFTWARE SUMMARY


Title        Source/Availability*    Description

Data Bank Management Programs

GRID         Harvard (2) / HEC       Prints grey shade overprint maps of
                                     grid data.

AUTOMAP II   ESRI (3) / HEC          Prints grey shade overprint maps and
                                     generates grid data from polygon data.

REGISTER     HEC (1) (New)**         Registers polygon data sets to base
                                     map coordinates.

TOPO/INTPL   HEC (1) (New)           Generates grid topographic data from
                                     contour data.

BANK         HEC (1) (New)           Manages files comprising grid cell
                                     data bank.

Data Bank Processing Interface

HYDPAR       HEC (1) (New)           Generates hydrologic, storm quality
                                     and erosion modeling parameters from
                                     data bank.

DAMCAL       HEC (1) (New)           Generates elevation-damage files
                                     from data bank.

ATODTA       HEC (1) (New)           Coordinates and manages economic,
                                     hydraulic and hydrologic data for
                                     modeling.

ROUTE        HEC (1) (New)           Generates hydrologic modeling data
                                     from stream geometry files.

Comprehensive Analysis

HEC-1        HEC (Modified) (1)      Generalized hydrologic and flood
                                     damage analysis model.

HEC-2        HEC (1)                 Generalized river hydraulic model;
                                     converts flow to elevation.

STORM        HEC (1)                 Generalized urban storm water quality
                                     and surface erosion model.

WQRRS        HEC (1)                 Generalized stream water quality
                                     simulation model.

RIA          Harvard/HEC (New)       Spatial analysis package for
                                     attractiveness and impact analysis
                                     based on work done by Harvard.

*Program reference documents

**Programs labeled (New) were developed specifically for the pilot study and
have not been generally released.




the  hydrologic  effects  of  development  proposals  and  also  integration  of  the 
hydrologic  with  economic  damage  data  to  provide  the  assessment  of  the  expected 
value  of  annual  damages  (average  annual  damages)  resulting  from  development 
alternatives.  The  RIA  program  operates  by  direct  link  to  the  data  bank  and 
performs  coincident,  attractiveness  and  vulnerability  analysis  and  general  grid 
mapping.  The  program  is  adapted  from  work  by  Harvard  (5)  and  makes  use  of  a 
modified version of the general grid plot program GRID (2). STORM and WQRRS
(4)  are  recently  developed  Hydrologic  Engineering  Center  computer  programs  that 
forecast  urban  stormwater  quality  and  dynamic  instream  water  quality  simulations 
of  waste  loadings  from  treatment  plants  and  urban  storm  runoff. 

b.  Hardware:  The  HEC-SAM  system  has  been  developed  to  operate  on  major 
computer  systems.  The  system  used  most  extensively  for  original  program 
development  work  was  the  CDC  7600  installation  at  the  Lawrence  Berkeley 
Laboratories  of  the  U.  S.  Nuclear  Regulatory  Commission  (Berkeley,  California). 
The programs are written in ANSI Standard FORTRAN IV and are thus basically
portable between major computer systems. No transfer of programs has yet
been made, however. Major systems with 64,000 words of core storage
and 4 peripheral storage devices (tapes, discs, etc.) and standard line printer
output can accommodate all system programs. The Data Bank Management and Data
Bank Processing Interface programs do not require the storage and computer speed
of the major programs and thus could be operated on smaller, perhaps even mini,
computer systems. The comprehensive analysis programs require the core size
and  execution  speed  of  major  computer  systems  to  be  used  efficiently  and 
effectively. 

c. Input, Analysis and Output: The system envisions that the data normally
used during comprehensive planning studies would be encoded and processed onto
a computer storage device (such as tape or disc) by application of the various
Data Bank Management programs. The specific programs used would depend upon the
form in which the digitized data are received (point, grid, contour or
polygon), the form being dependent upon the nature of the variable and the
relative advantages and disadvantages of alternative encoding methods. The initial
input data are the basic resource maps that are encoded and placed into the
data  bank.  Analysis  would  be  performed  for  a selected  development  condition 
(alternative  future,  e.g.,  a projected  land  use  pattern  with  a certain  flood 
hazard  zoning  policy)  by  processing  the  development  into  the  data  bank  as  a new 
variable  and  successively  executing  the  Interface  and  Comprehensive  Analysis 
programs.  The  specific  executions  that  are  performed  would  be  dependent  upon 
the  specific  nature  of  the  alternative  future  that  is  assessed. 

The  comprehensive  assessments  require  specific  input  data  such  as  the 
hydrologic structure of the area, stream geometry, calibrated storms,
relationships between land use and runoff, damage potential, storm pollutant
washoff, etc. The initial modeling calibration data is prepared conventionally based
on observed data supplemented by parameters generated from the data bank and
then  the  calibration  data  is  used  as  the  mechanism  for  forecasting  the  change 
in  modeling  data  that  would  be  caused  by  development  alternatives. 




The system output includes 1) grid map graphic displays of the data
variables, attractiveness, and impact analysis results and 2) detailed
numeric printout of runoff hydrographs, flow exceedance frequency
relationships, expected annual damages, storm pollutographs and time traces of
erosion  and  a range  of  water  quality  parameters  for  existing  and  the  selected 
alternative  future  development  patterns.  The  output  corresponds  to  the  complete 
range  of  technical  output  of  comprehensive  flood  plain  assessments. 


d. Resolution and Accuracy: A major purpose in the creation of HEC-SAM
was to ensure that consistent, systematic analysis of future development is
performed in traditional functional areas and with a common data set. The
level of detail and accuracy of final analysis was to be consistent with
traditional methods, i.e., no loss of detail was to be permitted. For hydrologic
computations, rather coarse grid sizes (4 to 10 hectares) and relatively few
categories of major variables (for example 4 to 5 land use classes) are
considered sufficient. General environmental analysis does not seem to
demand greater detail than is required for hydrologic analysis.

Flood damage calculations, on the other hand, require rather accurate
terrain resolution within the flood plain and require that land use categories
be subdivided. Depending upon topographic variation, grid cells as small as 1/4
hectare are necessary, whereas in more gentle terrain cells as large as 2 hectares
could be acceptable. In any event, it appears that the terrain variation and the
subsequent detail required for flood damage analysis dictate the appropriate
grid cell size. The present state of HEC-SAM does not permit variable grid
cell sizes to be stored in the data bank, so the terrain of the study area
dictates the size of the grid cells for all data variables.
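
The storage consequence of a single, terrain-dictated cell size can be
illustrated with simple arithmetic. The following is a minimal sketch, not part
of HEC-SAM; the function names are hypothetical, and the 12 square mile area is
used only as an example of a small watershed.

```python
# Illustrative arithmetic only; these helpers are hypothetical, not HEC-SAM code.
HECTARES_PER_SQ_MILE = 258.999  # one square mile is about 259 hectares


def cell_count(area_sq_mi, cell_size_ha):
    """Number of uniform grid cells needed to cover an area."""
    return round(area_sq_mi * HECTARES_PER_SQ_MILE / cell_size_ha)


def storage_ratio(fine_ha, coarse_ha):
    """How many fine cells replace one coarse cell."""
    return coarse_ha / fine_ha


if __name__ == "__main__":
    for size in (0.25, 2.0, 4.0, 10.0):
        print(f"{size:5.2f} ha cells: {cell_count(12, size):6d} cells per variable")
    print("fine cells per 4 ha cell:", storage_ratio(0.25, 4.0))
```

Under these assumptions, storing every variable at the 1/4 hectare size needed
for flood damage work takes 16 times as many cells as the 4 hectare size that
would suffice for hydrologic computations.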


4.  Analysis  Capabilities 


The general capability of HEC-SAM is to provide a comprehensive, systematic
assessment  of  alternative  development  patterns  in  the  functional  areas  of  flood 
hazard,  flood  damage  and  environmental  status.  A listing  of  the  more  commonly 
used  capabilities  in  each  of  these  areas  would  include: 


a. Flood Hazard: HEC-SAM will evaluate the following prespecified
alternatives for a specific storm event (such as the 100-year interval event)
or for a range of storm events (development of flow and/or elevation exceedance
frequency relationship) at all selected important locations within a study area.

. Changed land use patterns
. Changed drainage system
. Flood plain occupancy encroachments
. On-site water management strategies
. Engineering works of levees, channel modifications, reservoir
storage and flow rerouting
. Watershed management practices




b. Flood Damage: HEC-SAM will evaluate the dollar damages for a specific
event (such as the 100-year exceedance interval event) and the expected value
of annual damages (average annual damages) for each designated location in the
study area and each desired damage category (residential, commercial, etc.) for
the following:

. Changed flood plain occupancy
. Changed watershed runoff such as from changed land use
. Changed stream conveyance such as from floodplain encroachment
. Changed structural construction practices
. Alternative development control policies
. Changed value of flood plain structures
. Modified structure damage potential such as from flood proofing
. Effects of engineering flood control and drainage works of levees,
channels, reservoirs, and diversions
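
The expected value of annual damages is conventionally obtained by integrating
damage over annual exceedance probability. The sketch below shows that idea with
a trapezoidal rule; it is an assumption about the general technique, not the
HEC-SAM implementation, and the damage-frequency points are invented for
illustration.

```python
# Hypothetical sketch of expected annual damage; data values are made up.
def expected_annual_damage(exceedance_prob, damage):
    """Integrate damage over annual exceedance probability (trapezoidal rule).

    Only the range of probabilities supplied is integrated; damage from
    events rarer than the rarest point given is not included.
    """
    pairs = sorted(zip(exceedance_prob, damage))
    total = 0.0
    for (p0, d0), (p1, d1) in zip(pairs, pairs[1:]):
        total += 0.5 * (d0 + d1) * (p1 - p0)
    return total


# 100-, 50-, 25-, 10- and 5-year events expressed as annual probabilities.
probs = [0.01, 0.02, 0.04, 0.10, 0.20]
damages = [500.0, 300.0, 150.0, 40.0, 0.0]  # assumed damage in $1000s

ead = expected_annual_damage(probs, damages)

if __name__ == "__main__":
    print(f"expected annual damage: {ead:.1f} thousand dollars")
```

Because rare events carry small probability weights, a large single-event damage
contributes only modestly to the annual expectation.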

c. Environmental: HEC-SAM will perform a variety of environmental
evaluations for the alternatives and conditions described in Flood Hazard and
Flood Damage above. The evaluations that can be performed are:

. Catalogue environmental habitat changes from changed land use
(coincident analysis)
. Forecast changes in land surface erosion and transport for land use
and engineering work changes
. Forecast changes in runoff quality from changed land use
. Forecast changes in stream water quality
. Develop first order attractiveness and impact spatial displays
. Identify enriched habitat zones by ecotone analysis

5.  Applications 

HEC-SAM was created as a part of a pilot study entitled "Expanded Flood
Plain Information Study for the Upper Oconee River Basin." The study area
includes Athens, Georgia, and the study was undertaken by the Savannah District,
Corps of Engineers, with analytical assistance by The Hydrologic Engineering Center.
A test area of the pilot study area was selected and full scale analysis performed
and documented in (1). The pilot study itself has been completed and publication
of the initial assessments of four alternative future land use patterns is
expected in late 1976. Based on findings and encouraging results from this initial
pilot study, three others have been initiated that are at various stages of
completion ranging from just beginning to 50 percent complete. Six to eight
new studies are programmed for initiation during FY 77.

The results of HEC-SAM applications are generated in map, tabular, and
graphic format, much of which is complex and detailed and thus requires an
experienced professional to interpret and display. This is especially true
of the detailed water quality assessments. Selected published test results (1)
have been extracted and are presented herein to illustrate the nature of the
outputs.


The test area for which results are shown is the Trail Creek watershed,
which occupies about 12 mi² of the pilot study area of 300 mi² and includes
a portion of the city of Athens, Georgia. The test area is presently about 10
percent urban and is expected to grow to 20 to 30 percent urban by 1990. The data
bank created for Trail Creek included the 15 data variables shown in Figure 1
at a grid size of approximately 0.6 hectares.

a. Flood Hazard: Table 2 summarizes the results of evaluating the
alternative conditions indicated. Note that the flow rate increases for each
of the specified probabilities but at a proportionately smaller rate for rarer
events. Note also that the flow rate change for, say, the 100-year event differs
between control points and that the change in flood elevation is not directly
proportional to the change in flow. Study of the table indicates that the
hydrologic consequences of land use and engineering works are complex and require
careful analysis.


TABLE 2

HYDROLOGIC DATA SUMMARY
TRAIL CREEK TEST


100 YEAR PEAK FLOW AND ELEVATION

  Index       Existing Land Use          1990 Land Use
  Station   Flow (cfs)  Elevation    Flow (cfs)  Elevation

     1         7600       627.1         9400       628.3
     2         3450       656.4         3800       656.7
     3         2600       711.9         2900       712.2
     4         3900       650.3         5100       651.2
     5         1600       694.2         1650       694.3


FLOW - EXCEEDANCE INTERVAL DATA (cfs)

                                     Index Station
  Exceedance          1             2             3             4             5
  Interval (yr)  Exist  1990   Exist  1990   Exist  1990   Exist  1990   Exist  1990

        5         2000  2800     950  1200     800   960    1100  1700     500   570
       10         3000  3900    1350  1650    1100  1300    1600  2300     700   780
       25         4400  5600    2000  2400    1600  1850    2300  3300    1000  1100
       50         5800  7300    2650  3000    2100  2350    3000  4000    1250  1350
      100         7600  9400    3400  3800    2700  3000    4000  5200    1600  1700
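
The observation that flows increase less than proportionately for rarer events
can be checked directly from the Index Station 1 column of Table 2. The snippet
below is a small illustrative calculation; the variable and function names are
not from the original study.

```python
# Percent increase in peak flow at Index Station 1, existing vs. 1990 land use.
# Flow values (cfs) are the Index Station 1 entries of Table 2.
intervals = [5, 10, 25, 50, 100]            # exceedance interval, years
existing = [2000, 3000, 4400, 5800, 7600]   # existing land use
future = [2800, 3900, 5600, 7300, 9400]     # 1990 land use


def percent_increase(before, after):
    """Relative change from before to after, in percent."""
    return 100.0 * (after - before) / before


increases = [percent_increase(b, a) for b, a in zip(existing, future)]

if __name__ == "__main__":
    for years, pct in zip(intervals, increases):
        print(f"{years:4d}-year event: {pct:5.1f} percent increase")
```

The computed increase falls steadily from the 5-year event to the 100-year
event, consistent with the statement in the text.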

b. Flood Damage: Table 3 summarizes selected expected annual damage
assessments for a range of conditions and land use control policy sets for damage
reaches within the Trail Creek watershed that sustain significant damages.




TABLE 3

SELECTED DAMAGE ASSESSMENTS
TRAIL CREEK TEST

(Expected Annual Damage in 1000's $)

  EVALUATION CONDITION                                      DAMAGE REACH
  CODE   LAND USE POLICY 1/           HYDROLOGY          1        2        3     TOTAL

  I      Existing                     Existing          1.5      1.9     11.9     15.3
                                      (1974)

  X      1990 with no development     1990           1033.3    350.0     32.7   1416.0
         controls

  IV     1990 with new development    1990             19.3     63.8     23.8    106.9
         at 1974 100-year flood
         level

  V      1990 w/new devel. @ 1974     1990             16.8     18.9      4.7     40.4
         100-year & flood proofed
         to ground floor

  VIII   1990 w/new devel. @ 1990     1990             11.9     16.0      2.8     30.7
         100-year & flood proof
         to ground floor

1/ The 1990 land use condition is a projection based on local agency
judgment. In some instances, such as Damage Reach 3, 1990 urban
type development has displaced some present agricultural development.




The  results  are  somewhat  surprising  and  at  first  glance  may  be  difficult 
to  understand.  An  initial  reaction  might  be  that  evaluation  condition  CODE  IV 
should  be  similar  to  CODE  I since  the  policy  of  no  new  development  occurring 
at  elevations  below  the  100-year  event  is  in  effect.  The  table  shows  a 
large  increase  in  expected  annual  damages.  This  increase  is  because  (1) 
damage  does  occur  for  new  basement  construction,  (2)  the  100- 
year  flood  for  1990  land  use  conditions  is  higher  than  the  100-year  flood  for 
existing  land  use  conditions,  and  (3)  damages  are  sustained  by  new  development 
from events that exceed the 100-year event. Several other evaluations that
include a number of alternative control and flood proofing policies are
included to demonstrate the broad capability of the spatial data management
technique as well as to present some interesting evaluations of policies designed
to manage flood losses.

c. Environmental: The environmental assessment results selected for
illustration  emphasize  the  spatial  data  management  features  of  HEC-SAM.  The 
water  quality  and  land  surface  erosion  assessments  for  the  pilot  study  are 
undergoing  final  interpretation  and  are  thus  not  available  for  publication. 

The coincident tabulation is a first level of analysis and consists
of a data display cataloging change. The concept is to track changes in
watershed land use coincident with an environmental interpretation of the
watershed. The coincident analysis may be performed between any pair of data
variables in the file for any areal subdivision (such as subbasins or damage
reaches). Table 4 is a coincident tabulation for the existing and 1990 land use
conditions for damage reach No. 2.

The values displayed in the table are the number of acres which are
coincident with the row and column categories. The diagonal values in the
matrix  are  the  number  of  acres  which  have  not  changed  land  use  classification 
from  the  existing  condition  to  the  1990  alternative  future  condition.  For 
example  in  row  1,  106  acres  that  were  classified  as  natural  vegetation  under 
existing  conditions  remain  so  classified  under  1990  conditions,  6 acres  of 
land  classified  as  natural  vegetation  under  existing  conditions  are  converted 
to  medium  density  residential  use,  etc.  The  total  amount  of  land  classified 
as  natural  vegetation  under  existing  conditions  is  272  acres  while  the  total 
amount  of  land  classified  as  natural  vegetation  under  the  proposed  1990 
condition  would  be  141  acres. 
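
A coincident tabulation of this kind amounts to a cross tabulation of two grid
variables, with each cell count converted to area. The following is a minimal
sketch under stated assumptions: the function name, the toy category codes and
the 1.5 acres per cell figure are all hypothetical and are not taken from the
RIA program.

```python
from collections import Counter


def coincident_table(existing, future, acres_per_cell):
    """Acres in each (existing category, future category) combination.

    existing and future are equal-length sequences of category codes,
    one entry per grid cell of the subdivision being analyzed.
    """
    counts = Counter(zip(existing, future))
    return {pair: n * acres_per_cell for pair, n in counts.items()}


# Toy data: 1 = natural vegetation, 4 = medium density housing,
# 6 = agriculture, 9 = pasture (codes chosen for illustration).
existing_use = [1, 1, 1, 1, 6, 9, 9]
future_use = [1, 1, 4, 9, 9, 9, 1]

table = coincident_table(existing_use, future_use, acres_per_cell=1.5)
row_1_total = sum(a for (row, _), a in table.items() if row == 1)

if __name__ == "__main__":
    for (row, col), acres in sorted(table.items()):
        print(f"existing {row} -> future {col}: {acres:.1f} acres")
    print("total existing category 1:", row_1_total)
```

Entries with equal row and column codes correspond to the diagonal of the
matrix, that is, land whose classification does not change.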

TABLE 4

LAND USE COINCIDENTS


OCONEE PILOT FPI
TRAIL TEST
LAND USE COINCIDENTS
DAMAGE REACH 2
COINCIDENTS MATRIX

             1       2       3       4       5       6       7       8       9      10    TOTAL

    1    105.57    0.00    0.00    6.12    0.00    0.00   41.31   27.54   91.80    0.00   272.34
    2      0.00    0.00    0.00    0.00    0.00    0.00    0.00    0.00    0.00    0.00     0.00
    3      0.00    0.00    0.00    3.06    0.00    0.00    0.00    0.00    0.00    0.00     3.06
    4      0.00    0.00    0.00    0.00    0.00    0.00    0.00    0.00    0.00    0.00     0.00
    5      0.00    0.00    0.00    0.00    0.00    0.00    0.00    0.00    0.00    0.00     0.00
    6      6.12    0.00    0.00    0.00    0.00    0.00    6.12    0.00    4.59    0.00    16.83
    7      4.59    0.00    0.00    0.00    0.00    0.00   13.77    0.00    4.59    0.00    22.95
    8      0.00    0.00    0.00    0.00    0.00    0.00    0.00    0.00    0.00    0.00     0.00
    9     24.48    0.00    0.00    1.53    0.00    0.00   10.71    9.18   24.48    0.00    70.38
   10      0.00    0.00    0.00    0.00    0.00    0.00    0.00    0.00    0.00    0.00     0.00

 TOTAL   140.76    0.00    0.00   10.71    0.00    0.00   71.91   36.72  125.46    0.00

  ROW CATEGORIES ARE                  COLUMN CATEGORIES ARE
  EXISTING LAND USE                   1990 LAND USE

   1  NATURAL                          1  NATURAL VEGETATION
   2  DEVELOPED SPACE                  2  DEVELOPED OPEN SPACE
   3  LOW DENSITY HOUSING              3  LOW DENSITY HOUSING
   4  MEDIUM DENSITY HOUSING           4  MEDIUM DENSITY HOUSING
   5  HIGH DENSITY HOUSING             5  HIGH DENSITY HOUSING
   6  AGRICULTURE                      6  AGRICULTURE
   7  INDUSTRIAL                       7  INDUSTRIAL
   8  COMMERCIAL                       8  COMMERCIAL
   9  PASTURE                          9  PASTURE
  10  WATERBODY                       10  WATERBODY


Attractiveness analyses for a number of potential uses within the basin
were performed. The output is standard overprint grey shades indicating the
relative attributes of each grid cell with respect to the others. The
attractiveness display (not shown) used to illustrate the capability was for
neighborhood park locations. In the display the darker shaded areas are, in
a relative sense, more attractive for park development than the lighter shaded
areas. The data variables of damage reaches (areas within the flood plain
are preferred over areas outside the flood plain), land surface slope (flat
and mild slopes are preferred over steep slopes), existing land use (natural
vegetation, agriculture and pasture are favored over other categories), and
distance to housing (areas near the low, medium and high density residential
areas are preferred to areas removed and to areas near other land uses) were
used in the determination.
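
One simple way to picture an attractiveness display is as a weighted sum of
per-variable preference scores for each grid cell, mapped onto a scale of
printer shades. The sketch below assumes that scheme; the actual scoring rules
of the RIA program are not described here, and the weights, scores, variable
names and shade characters are all hypothetical.

```python
# Hypothetical attractiveness scoring; not the RIA program's actual method.
SHADES = " .:+#"  # lighter to darker, a stand-in for overprinted grey shades


def attractiveness(cell, weights):
    """Weighted sum of preference scores (each assumed to lie in 0..1)."""
    return sum(weights[name] * score for name, score in cell.items())


def shade(score, lowest, highest):
    """Map a score onto a shade character relative to the other cells."""
    if highest == lowest:
        return SHADES[0]
    position = (score - lowest) / (highest - lowest)
    return SHADES[int(position * (len(SHADES) - 1) + 0.5)]


weights = {"flood_plain": 1.0, "slope": 1.0, "land_use": 1.0, "near_housing": 1.0}
cells = [
    {"flood_plain": 1.0, "slope": 1.0, "land_use": 1.0, "near_housing": 1.0},
    {"flood_plain": 0.0, "slope": 0.5, "land_use": 1.0, "near_housing": 0.5},
    {"flood_plain": 0.0, "slope": 0.0, "land_use": 0.0, "near_housing": 0.0},
]

scores = [attractiveness(c, weights) for c in cells]
shades = [shade(s, min(scores), max(scores)) for s in scores]

if __name__ == "__main__":
    for s, ch in zip(scores, shades):
        print(f"score {s:.1f} -> shade {ch!r}")
```

Because shades are assigned relative to the lowest and highest scores present,
the display ranks cells against one another and does not give an absolute
measure of suitability.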

6.  System  Running  Cost 

HEC-SAM consists of 8 major computer programs and several smaller
ones. The cost of a run or analysis is therefore quite dependent upon the
specific analysis performed and the computer system on which the programs are
being run. In addition, computer processing involving manipulation of data
bank grid information is dependent upon the number of grid cells to be
processed and the number of variables included in the analysis. While a
definitive schedule of costs is difficult to specify, the cost data in Table 5,
resulting from processing related to the pilot study, should provide a
general order of magnitude estimate for comparison purposes. The processing
was performed on a CDC 7600 computer system with special government rates
averaging about $.12 per computing unit second (CUS).


TABLE 5

SELECTED PROCESSING COST

Data Management and Processing Interface

  Operation                 No. Cells or       No. Variables    CUS    Cost ($)
                            Polygons

  AUTOMAP II Map            18 polygons              1           14      1.68
  GRID Map                  22,350 cells             1           43      3.61
  RIA Attractiveness Map    11,868 cells             3           31      2.87
  HYDPAR                    140,760 cells            5          178     18.74
  DAMCAL                    140,760 cells*           4           95     11.53

* Damage Reach cells were windowed into a mini data bank
(approximately 30,000 cells)


TABLE 5 Cont'd.

Flood Hazard, Flood Damage and Environmental

  Operation                 No. Subbasins   No. Index Points    CUS    Cost ($)

  System Hydrographs             21                5             17      1.89
  Flood Freq. &                  21                5             44      3.78
  Ex. Ann. Damage
    "     "     "                31               22            132     13.63


7.  System  Implementation  and  Operation 


The  HEC-SAM  system  is  designed  to  be  used  by  Corps  District  offices 
servicing  local  communities  in  an  advisory  capacity.  The  system  would  either 
be  operational  on  local  computer  equipment  or  access  to  a central  computer 
where the system is maintained by HEC would be available. A typical study
would be undertaken under the direction of a study manager, who would be a
Corps planning professional, and a minimum of one each of a hydrologic engineer,
economist, environmental planner and data bank manager who would provide the
computer system support. Local planning agency staff in parallel professional
specialties would likewise be assigned, especially if the resulting data system
and analysis programs were to be turned over to the local agency. A blend of
private enterprise, Corps and local agency efforts would be employed to create
the  initial  data  bank.  Subsequent  maintenance  and  updates  would  be  the  joint 
responsibility  of  the  particular  specialty  area  and  the  data  bank  managers. 
Because  of  the  comprehensiveness  of  the  system  capabilities,  it  is  unlikely 
that  any  but  the  largest  local  agencies  would  have  the  capability  to  operate 
more  than  the  simple  data  management  features  of  the  system.  It  is  anticipated 
that  after  initial  alternative  future  assessments,  processing  would  probably 
be  performed  by  the  Corps  as  a continuing  service. 

The present status of HEC-SAM will require intensive training and continued
consultation and support from HEC. In the longer range, perhaps 2-3 years,
the system is expected to stabilize sufficiently to package the elements and make
the capability available on a broad basis both within and outside the Corps.

8.  Future  Improvements 

Research  and  development  activities  to  date  have  concentrated  on  assembling 
a working  system  and  interfacing  the  functional  elements  into  a coherent  analysis 
package. Several items have surfaced that require further developmental work
and that could enhance the system's utility and capability. The items listed
below are judged to be priority enhancements that are currently receiving or
will soon receive research attention.




1. Automated interface between HYDPAR and HEC-1 to enhance calibration and
improve efficiency of hydrologic assessments.

2. Expanded graphical output. Current system output is computer printer
graphics and tabular data. Much work needs to be done to provide higher
quality graphics (alternative to printer graphics), printer graphics
where none presently exists, e.g., flow frequency curves, damage curves,
etc., and plotter graphics where none exists, e.g., frequency curves,
damage curves, etc.

3. Consolidation of software. From a services standpoint, maintaining separate
software packages for grid data printer plots, polygon data printer plots, etc.,
is cumbersome. The graphics and other special purpose data management
programs need consolidating into more serviceable packages.

4. Capability to vary grid size within a data bank for a specific variable.
It is unreasonable that the least count resolution that applies in only
a portion of a study area for a specific variable, such as terrain in
the flood plain, dictates the resolution of the remainder of the data bank.

5. Enhance automated interface with storm water and stream quality analysis.

The  general  state  of  the  art  analysis  capability  that  has  been  captured 
in  HEC-SAM  greatly  enhances  the  ability  of  planners  to  assess  the  potential 
effects  of  change.  However,  many  technical  analysis  areas  are  still  in  their 
infancy  and  could  benefit  from  basic  research  efforts.  The  most  important 
appear  to  be  1)  techniques  to  improve  forecasting  the  hydrologic  impact  of 
urbanization, 2) analysis methods to assist in allocation of land use
consistent with projections and resource capabilities, 3) improvements in
erosion and sedimentation analysis and 4) development of a scientific base
for environmental habitat analysis.




REFERENCES 


1. "Phase I Oconee Basin Pilot Study, Trail Creek Test," The Hydrologic
Engineering Center, September 1975.

2. GRID MANUAL Version 3, Laboratory for Computer Graphics and Spatial
Analysis, Harvard University, October 1971.

3. AUTOMAP II Users Manual, Environmental Systems Research Institute,
Redlands, California.

4. FY 1975 Annual Report, The Hydrologic Engineering Center, July 1975.

5. Honey Hill, A Systems Analysis for Planning the Multiple Use of Controlled
Water Areas, IWR Report 71-9.


GEOGRAPHIC DATA HANDLING ISSUES:

AN ALTERNATIVE TO VARIABLE GRID RESOLUTION


By

Kenneth J. Dueker¹

Robert H. Erickson²

Evan Noynaert³

INTRODUCTION  TO  SUGGESTED  PHILOSOPHY  OF  HANDLING  SPATIAL  DATA 

Grid cell encoding of data is a well established technology, which emerged
from the era of unit record electronic data processing and was extended by modern
computers and batch processing, where the grid format facilitated programming.
Grid cell configurations became a proven technology for encoding geographic
data into machine readable form. However, it must be recognized that grid cell
encoding was an outgrowth of processing constraints of earlier hardware
technologies. As these constraints are relaxed, we must ask whether grid cell
encoding is still the appropriate technology.

In recent years batch processing has been supplemented and even replaced
by the interactive minicomputer. The University of Iowa's Department of
Geography and Institute of Urban and Regional Research are studying
methodologies for applying interactive minicomputer technologies to geographic
information systems applications. This exploration of new technologies has led to a
reexamination of existing grid cell methodologies. Within a minicomputer and
coordinate digitizer environment, a team of researchers at the University of Iowa
is developing a geographic encoding, storage, retrieval and analysis package,
which could have application to the Corps' geographic data handling needs. The
system, dubbed INFOS, has been developed to the level where its potential is
clearly demonstrable. The system may be of interest to the Corps because it
overcomes many of the problems of grid cell encoding and storage without resorting
to a multiple resolution grid format or to a polygon overlay system. INFOS
utilizes a scan line format, which in the present version is derived from polygon
data, but which could be generated from direct raster scanning of source documents
in a future version. INFOS allows the user to select resolution for overlay
analysis, which precludes the need for a variable grid resolution storage system.
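
The idea of deriving scan lines from polygon data at a resolution chosen at
analysis time can be sketched as follows. This is only an illustration of the
general technique; the internal format of INFOS is not described in this
section, and the function name and run representation are assumptions.

```python
# Hypothetical sketch: convert a polygon outline to scan line runs.
def polygon_to_runs(vertices, spacing):
    """Return (y, x_start, x_end) runs where scan lines cross the interior.

    vertices is a list of (x, y) points in order around the polygon;
    spacing is the distance between scan lines, chosen by the user.
    Scan lines are placed at the middle of each band of height spacing.
    """
    ys = [y for _, y in vertices]
    top = max(ys)
    runs = []
    y = min(ys) + spacing / 2.0
    n = len(vertices)
    while y < top:
        crossings = []
        for i in range(n):
            (x0, y0), (x1, y1) = vertices[i], vertices[(i + 1) % n]
            # Half-open test skips horizontal edges and avoids double counting.
            if (y0 <= y < y1) or (y1 <= y < y0):
                crossings.append(x0 + (y - y0) * (x1 - x0) / (y1 - y0))
        crossings.sort()
        runs.extend((y, a, b) for a, b in zip(crossings[0::2], crossings[1::2]))
        y += spacing
    return runs


square = [(0, 0), (4, 0), (4, 4), (0, 4)]
fine_runs = polygon_to_runs(square, spacing=1.0)
coarse_runs = polygon_to_runs(square, spacing=2.0)

if __name__ == "__main__":
    print(len(fine_runs), "runs at spacing 1;", len(coarse_runs), "runs at spacing 2")
```

The same stored outline yields four runs at a spacing of 1 and two runs at a
spacing of 2, so the resolution is a property of the analysis and not of the
stored data.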


¹Director, Institute of Urban and Regional Research; Professor, Urban and Regional
Planning and Geography, University of Iowa.

²Ph.D. Candidate, Department of Geography and Research Assistant, Institute of
Urban and Regional Research, University of Iowa.

³Ph.D. Program, Department of Geography and Research Assistant, Institute of
Urban and Regional Research, University of Iowa.




Several large automated geographic information systems have been
developed around grid cell data methods, including that of the Hydrologic
Engineering Center of the U.S. Army Corps of Engineers. Although grid cell mapping
represents a proven technology, there are problems inherent in a grid cell based
geographic information system. The Corps is considering refinement of its
current uniform grid cell system into a variable resolution grid cell data base.
While this plan has merit and should be carefully evaluated, other methods
for attaining the Corps' ultimate objective, efficient and effective management
of  geographic  data,  should  also  be  considered.  This  paper  will  briefly  review 
some  commonly  available  technologies  and  discuss  the  encoding,  storage, 
retrieval,  and  manipulation  system  developed  at  the  University  of  Iowa. 

Common  Data  Storage  Techniques 

Grid, polygon, and topologic structuring are the most common methods
of encoding and manipulating geographic data. Technicians continue to debate
the relative merits of the various systems, but in fact no one system can be
regarded as the preferred encoding and storage method in all situations. (2)

Grid Systems. The cell is the basic unit of a grid mapping system. The
matrix of grid cells is produced by applying a uniformly structured system of
grid lines over the map area. Each resulting cell is tagged with the
predominant characteristic of the map area contained in that cell. This method
of capturing geographic data is conceptually simple. Grid cell data structures
can be very useful for planning and other purposes where data must be overlaid
and where accuracy is not a prime consideration. In addition, map output can
often be easily and inexpensively produced on line printers available at most
computer installations. Originally grid mapping was motivated by the need to
facilitate programming, but its continued use is reinforced by supporting
technology from satellite imagery, raster scanners, and special purpose
computers. (1) Grid cell data systems can be criticized because they do not
allow precise placement of boundaries and points on the map, and only the
predominant characteristic of each grid cell is usually recorded.
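
Tagging each cell with its predominant characteristic can be sketched as a
majority vote over finer samples falling within the cell. The code below is a
minimal illustration with hypothetical names and data (F, U and W stand for
arbitrary land cover classes); it does not reproduce any particular encoding
program.

```python
from collections import Counter


def predominant(samples):
    """Most frequent category among the samples in one grid cell."""
    return Counter(samples).most_common(1)[0][0]


def encode_grid(fine, block):
    """Encode a fine sample matrix into grid cells of block x block samples."""
    rows, cols = len(fine), len(fine[0])
    grid = []
    for r in range(0, rows, block):
        grid_row = []
        for c in range(0, cols, block):
            samples = [
                fine[i][j]
                for i in range(r, min(r + block, rows))
                for j in range(c, min(c + block, cols))
            ]
            grid_row.append(predominant(samples))
        grid.append(grid_row)
    return grid


# Each string is one row of fine samples taken from a source map.
source = ["FFUU",
          "FWUU",
          "WWFU",
          "WWUU"]

cells = encode_grid(source, block=2)

if __name__ == "__main__":
    for row in cells:
        print(" ".join(row))
```

The isolated W sample in the upper left cell and the isolated F sample in the
lower right cell disappear from the encoded grid, which is the loss of detail
that the criticism above refers to.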

MSDAMP (Multiple Scale Data Analysis and Mapping Program), developed
by the Land Use Analysis Laboratory of Iowa State University, has the capability
for overlaying grid cell map information of coverages with different grid cell
resolutions (but each coverage must have a uniform cell size). MSDAMP uses
the geodetic reference system (latitude and longitude) and defines cell sizes
in terms of seconds of arc. The geodetic system allows regional analysis at
almost any scale at or above the county level. MSDAMP is noted for its high
quality line printer maps and has been used for several major analysis projects.
(5,6)




Although MSDAMP can handle coverages with varying grid resolutions,
each individual coverage must maintain a uniform cell size. To get the effect
of multiple resolution with one coverage, it would be necessary to subdivide
the map into areas of different resolutions and then concatenate the sections
for analysis.
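
Overlaying two coverages that each have a uniform but different cell size can be
pictured as replicating the coarser coverage onto the finer grid before pairing
cells. The sketch below assumes the coarse cell size is an exact integer
multiple of the fine one and that the two coverages share an origin; it is not
a description of how MSDAMP itself performs the overlay, and all names and data
are hypothetical.

```python
# Hypothetical sketch of overlaying coverages with different uniform cell sizes.
def refine(coarse, factor):
    """Replicate each coarse cell into factor x factor fine cells."""
    fine = []
    for row in coarse:
        expanded = [value for value in row for _ in range(factor)]
        fine.extend(list(expanded) for _ in range(factor))
    return fine


def overlay(a, b):
    """Pair the values of two grids that have identical dimensions."""
    return [list(zip(row_a, row_b)) for row_a, row_b in zip(a, b)]


# Soils at an assumed 20 seconds of arc, land use at an assumed 10 seconds.
soils = [["clay", "sand"]]
land_use = [["F", "F", "U", "U"],
            ["F", "U", "U", "U"]]

combined = overlay(refine(soils, factor=2), land_use)

if __name__ == "__main__":
    for row in combined:
        print(row)
```

Replication adds no information to the coarser coverage; it only allows the two
coverages to be compared cell by cell at the finer size.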

Coordinate or Polygon Encoding. Coordinate encoding is a second
common method for handling geographic data. It attempts to preserve locational
integrity by recording the x,y coordinate of each significant feature on a map.
The encoding of a single point, such as a well, is very straightforward: the
x,y coordinate of the point is recorded along with an identifier or label. A
line is encoded as a series of discrete points located at significant inflections
of the line; the points are mathematically or mechanically connected on output
by the computer or the plotter. Since most coordinate mapping routines connect
the points with straight lines, a curve must be represented as a series of short
line segments. If the intended uses of the data require an exceptionally high
degree of locational accuracy, the line segments must be very short.

Polygons or areas may likewise be encoded by recording the coordinates
at each significant inflection point. When digitizing polygons, however, the
area is closed by giving the first and last points of the line the same coordinate
location. A serious problem with polygon encoding is the necessity to digitize
twice those segments which form the borders of two areas. For example, if one
were digitizing the outlines of Arizona and New Mexico, the part of the outline
of each state which forms a common border would have to be digitized twice.

Digitizing devices that make digitizing straightforward, efficient, and
accurate are commonly available. These digitizing machines require an operator
to trace the map with some type of stylus. Most digitizers allow rapid encoding
of data so that even the need for double digitizing boundaries does not pose a
serious problem. Computer programs to correct many small digitizing errors
can be fairly straightforward. For instance, the University of Iowa system
provides both computer assisted manual editing of digitized data and
a "Join" program which allows the computer to form boundaries between separately
digitized polygons.

The chief advantages of coordinate data handling systems are the rapid
encoding of data and the preservation of locational integrity. Except for
extremely complex coverages, coordinate data can be stored in the computer
much more efficiently than grid cell data at the same level of accuracy. Outputs
of coordinate maps are most accurately performed on line plotters or special
cathode ray tubes. Generally the resulting maps are easier to read than grid maps.

Topologic Data Structures. A refinement of the coordinate or polygon system
is the topologic data structure which represents lines and polygons as a network
of nodes, or line endpoints, and boundaries connecting the nodes. Because area data
can be represented as a boundary network and roads and streets can be represented




as flow networks, topological data structures are useful for representing
spatial relationships. In addition, logical editing of topological data structures
is employed to ensure quality control. The Census Bureau's topologic DIME
file is probably one of the best known geographic information systems, and
its wide use has encouraged some geographic information system designs to
adopt a topologic data structure.

Like coordinate systems, topologic data structures preserve locational
integrity of map features. Topologic data structures do not usually require
that lines between areas be digitized twice. Even so, topologic data may be
more time-consuming to produce and subject to larger amounts of digitizer
error than simple coordinate data, since an identifier for both sides of each
boundary line must be entered by the digitizer operator.
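
The idea of a boundary network can be illustrated with a minimal Python
sketch. It is not the DIME file layout; the class name, field names, node
names, and the "OUTSIDE" identifier are all hypothetical. Each boundary is
stored once, with the identifier of the area on each side, so a shared
border such as the one between Arizona and New Mexico needs only one record.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Boundary:
    """One boundary line between two nodes (hypothetical record layout)."""
    from_node: str
    to_node: str
    left_area: str   # identifier of the area on the left of the line
    right_area: str  # identifier of the area on the right of the line


def boundaries_by_area(boundaries):
    """Index every boundary under both of the areas it separates."""
    index = defaultdict(list)
    for b in boundaries:
        index[b.left_area].append(b)
        index[b.right_area].append(b)
    return dict(index)


def shared_boundaries(boundaries, area_a, area_b):
    """Return the boundaries that separate area_a from area_b."""
    return [b for b in boundaries
            if {b.left_area, b.right_area} == {area_a, area_b}]


# Illustrative network: the common border (n1 to n2) is stored only once.
segments = [
    Boundary("n1", "n2", "Arizona", "New Mexico"),
    Boundary("n2", "n3", "Arizona", "OUTSIDE"),
    Boundary("n3", "n1", "Arizona", "OUTSIDE"),
    Boundary("n2", "n4", "OUTSIDE", "New Mexico"),
    Boundary("n4", "n1", "OUTSIDE", "New Mexico"),
]

common = shared_boundaries(segments, "Arizona", "New Mexico")
arizona = boundaries_by_area(segments)["Arizona"]
```

The cost noted in the text is visible here: every record carries two area
identifiers that the digitizer operator must key in correctly.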


The  University  of  Iowa's  Digitizer  System 


The University of Iowa's Department of Geography and the Institute of
Urban and Regional Research are developing methods for handling geographic
data in a minicomputer environment. Iowa's "DIGIT SERIES" (4,8) is a user
oriented system which has been on-line for about one year with capabilities
for coordinate encoding, storage, retrieval, and manipulation of geographic
data. The hardware configuration consists of two minicomputers: an IMLAC
Corporation PDS-4 graphics display minicomputer and display monitor and
an HP-2000 Access system. The digitizing process is machine assisted by the
two minicomputers and uses a Graf/pen 3 digitizing unit. A communications
link exists between the Hewlett-Packard minicomputer and an IBM 360/65
computer. This link is used primarily for batch jobs to gain access to drum
plotters, line printers and other peripheral equipment attached to the 360 with
the option of retrieval back to the HP. (3)

In response to the question raised by the Hydrologic Engineering Center of
the U.S. Army Corps of Engineers, the Institute of Urban and Regional Research
of the University of Iowa is developing INFOS, a subsystem to DIGIT, consisting
of routines for converting polygon data to a grid cell format using a simulated
raster scanning algorithm. This algorithm is described and demonstrated in
Appendix A. DIGIT has proven capabilities for machine-assisted encoding and
editing of coordinate data; storage and retrieval of encoded geographic data;
and manipulation, display, and analysis of encoded data. INFOS adds an ability
for conversion from polygon to grid format allowing overlay and analysis of gridded
files. The unique aspect of INFOS is the ability to create scan records from more




than  a single  coverage,  which  enables  generation  of  overlay  statistics. 

For example, INFOS can be used to scan a land use coverage and census
tract. A future extension of INFOS will make it possible to handle
continuous coverages such as slope.

The chief advantage of INFOS is that analysis resolution is determined
by the analyst. Data is stored in its original coordinate form and converted
to a scanned file at the appropriate level of resolution at the time of analysis.
This is deemed more efficient than grid storage, even with variable grid
resolution, and INFOS resolution can be varied by controlling the step size
in the y-dimension to meet the specific analysis situation.


THE  DATA  STRUCTURE  PROPOSED  FOR  THE  STORAGE  OF  THE  SPATIAL  DATA 

The data structure and retrieval strategy used in the Iowa system is
based on hierarchical storage of coordinate files, which can be used to
reconstruct the original map or can be used to perform analysis. Storage of
data in the form of coordinate files gives greater flexibility in the types of
analysis which can be performed and in most cases reduces storage
requirements.

The hierarchical data structure of the coordinate file is shown in
Figure 1.


                         DISPLAY OR DATA SET

   ELEMENT   ELEMENT   ELEMENT   ELEMENT   ELEMENT   ELEMENT

   ATOMS     ATOMS     ATOMS     ATOMS     ATOMS     ATOMS

                              Figure 1


The "atom" is the basic unit of the DIGIT series and consists of the
x,y coordinates of a single point on a map. These atoms can represent an
isolated point, inflection points of a line or area boundary, or the location
of a label.

Point, line, area, and text "elements" are formed from groups of atoms
of the same type. In general terms, elements can represent recognizable map
units such as roads (a line element), civil divisions (an area element), or a
point source of pollution (a point element). Text elements portray legends
such as captions and map titles. Other data is contained in an element header
describing its characteristics (size, display intensity, etc.) followed by a
list of atoms. Appendix B gives the precise internal format of an element.

For storage and retrieval, elements are grouped into digitized files.
A file may contain any combination of point, area, line, and text elements
that is convenient to the user. Usually a single file contains one complete
map coverage such as all the land uses in one USGS 7 1/2 minute quadrangle,
but a file may include only part of a coverage. For example, it might be
useful to build one file of agricultural land uses in a given quad and put all
non-agricultural land uses in another file. These two files could be analysed
separately or jointly as dictated by the analysis to be performed, or they may
be combined or disaggregated to form a new file.
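
The atom, element, and file hierarchy can be sketched in Python as follows.
This is an illustration only, not the DIGIT storage format (see Appendix B
for that). The class names, the `combine` helper, the file names, and the
sample coordinates are hypothetical; the one-letter element codes for line,
point, and text are inferred from the Appendix B example.

```python
from dataclasses import dataclass, field
from typing import List, Tuple

Atom = Tuple[int, int]  # (x, y) of one map point, in digitizer units


@dataclass
class Element:
    kind: str          # "A" area, "L" line, "P" point, "T" text (assumed codes)
    atoms: List[Atom]  # the list of atoms that follows the element header
    label: str = ""


@dataclass
class DigitizedFile:
    name: str
    elements: List[Element] = field(default_factory=list)


def combine(name: str, *files: DigitizedFile) -> DigitizedFile:
    """Build a new file from several files; the inputs are left untouched."""
    return DigitizedFile(name, [e for f in files for e in f.elements])


quad = DigitizedFile("QUADLANDUSE", [
    Element("A", [(0, 0), (50, 0), (50, 40), (0, 40), (0, 0)], "CROPLAND"),
    Element("A", [(50, 0), (90, 0), (90, 40), (50, 40), (50, 0)], "RESIDENTIAL"),
    Element("P", [(20, 10)], "WELL"),
])

# Disaggregate one coverage into agricultural and non-agricultural files,
# then recombine them into a new file.
agri = DigitizedFile("AGRI", [e for e in quad.elements if e.label == "CROPLAND"])
other = DigitizedFile("NONAGRI", [e for e in quad.elements if e.label != "CROPLAND"])
rejoined = combine("REJOINED", agri, other)
```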

Once a file has been digitized and edited, several file manipulation
capabilities are available:

-- One or more files may be submitted for analysis without
   disturbing or altering the original file.

-- One or more files may be displayed. If multiple files are used,
   they may be overlayed or concatenated for viewing purposes.

-- A file may be scaled to new dimensions or placed into some
   other coordinate system such as state plane coordinates.

-- A file can be copied to form a new file. This allows file
   altering procedures such as scaling to be performed on a
   copy of the file while leaving the original data base intact.
   The altered file may become a new permanent file or may be
   erased after the analysis has been performed.

-- Elements may be added, deleted, or edited, or a new file created.
   By using tic marks, precise overlays and concatenation are
   possible.

-- A portion of one or more files may be "windowed" on the display
   monitor.




-- The file may be rotated on the display monitor.

-- Coordinate files may be processed to produce scan files.
   The scanning may be done at any resolution appropriate to
   the current analysis.

-- If two or more coordinate files are scanned, overlay
   statistics may be generated.

-- Any coordinate or scanned file may be added to or deleted
   from permanent or temporary storage at any time.


THE DATA FLOW OR INFORMATION TO EACH OF THE HYDPAR, DAMCAL, GRIDPLOT,
AND RIA PROGRAMS


After map data has been encoded and stored in coordinate form, it may be
used in its original coordinate form or converted to a grid structure.
Information flow is illustrated in Figure 2.


              Conversion to Proper Output Format

   LINEPRINTER MAP PRODUCTION or INPUT to HYDPAR, GRIDPLOT, etc.

                              Figure 2




Some types of display and analysis can be most effectively performed
using a coordinate type of file structure. For other types of analysis, including
input into HYDPAR, GRIDPLOT, DAMCAL, and RIA, a grid file is necessary.
Input to the Corps' programs would require a program to convert the INFOS
generated scan file into a HYDPAR compatible input.

It is important to note that in using INFOS it may not be necessary for
data to flow to HYDPAR and other Corps' programs. Most analysis and display
can be performed from the scan file or coordinate files.

Accessing data on the HP-2000 minicomputer used at the University of
Iowa is extremely simple. Each file must be given a unique name of up to
eleven alphanumeric characters. The programs which are to use the stored
files merely call for the name of each file which is to be used for input. Some
type of systematic naming procedure would doubtless be required for establishing
and maintaining a large map library such as that needed by the Corps of
Engineers.


HOW THE DATA STRUCTURE WOULD CONFORM TO COMPUTER GRAPHICS THAT
ARE BASED ON A SINGLE GRID CELL RESOLUTION

The Iowa system is primarily an interactive system built around the cathode
ray tube display. The primary display unit is an IMLAC PDS-4 minicomputer and
display monitor with a 21" screen and a 2048 x 2048 displayable raster. Display
on the IMLAC screen is generally superior to line printer graphics. Photocopies
of the display can be made on an IMLAC hard copier, measuring approximately
eight inches square. It is also possible to have maps produced on 11" and 29"
drum plotters, an electrostatic plotter, or a microfilm plotter.

While display on a CRT (or a photocopy of the monitor screen) is generally
more readable than a map produced using line printer graphics, the latter has
some useful features and is sometimes desirable. Line printer maps are usually
inexpensive and easy to produce on equipment readily available at most computer
centers. In addition, the Hydrologic Engineering Center has expressed an interest
in producing outputs at approximately the same scale as the source USGS 7 1/2
minute quadrangle maps.

Line printer output was not an original intention in the development of the
DIGIT series, but a simple software package could be developed to produce line
printer compatible output from files produced by INFOS. Currently scan line data
are based on a 4095 x 4095 matrix of "digitizer units" (DU) where one digitizer
unit equals 1/85 inch. The user can specify the step size in DU for the y
dimension. By specifying a step size of 11 digitizer units, a scan file will be
produced which when printed on a line printer set at eight lines per inch would
produce a map with the same y dimension and y scale as a USGS 7 1/2 minute




quadrangle map. Similarly, if the user specified a step size in the y dimension
of 14 DU and the map was produced on a line printer set at 1/6 inch per line,
an approximately scaled 7 1/2 minute quadrangle map would be produced.
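
The arithmetic behind these step sizes can be checked with a short Python
sketch. The function names are hypothetical; the only figure taken from the
text is the 1/85 inch digitizer unit. Each scan line covers step/85 inch of
the source map and prints as one line of 1/(lines per inch) inch, so the
best whole-number step is the one nearest to 85 divided by the lines per inch.

```python
DU_PER_INCH = 85  # one digitizer unit (DU) = 1/85 inch


def step_size_for_printer(lines_per_inch: float) -> int:
    """Whole step size (in DU) closest to the printer's line spacing."""
    return round(DU_PER_INCH / lines_per_inch)


def y_scale_ratio(step_du: int, lines_per_inch: float) -> float:
    """Printed y extent divided by source y extent (1.0 = identical scale)."""
    printed_inches_per_line = 1 / lines_per_inch
    source_inches_per_line = step_du / DU_PER_INCH
    return printed_inches_per_line / source_inches_per_line


step_8lpi = step_size_for_printer(8)   # eight lines per inch
step_6lpi = step_size_for_printer(6)   # 1/6 inch per line
ratio_8lpi = y_scale_ratio(step_8lpi, 8)
ratio_6lpi = y_scale_ratio(step_6lpi, 6)
```

Because the step must be a whole number of digitizer units, the match is
approximate: the printed map comes out a few percent short at eight lines
per inch and about one percent long at six lines per inch.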

The current version of INFOS has the x dimension continuously scaled
rather than having discrete step sizes, as exists in the y dimension. Conversion
to grid format from a continuous x axis scaling means specifying a horizontal
step size similar to what is done for the y axis.
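
One way such a conversion could work is sketched below in Python. It is an
assumption for illustration, not a description of INFOS: the function name,
the (width, label) record layout, and the rule of labelling each cell by the
run that covers the cell centre are all hypothetical.

```python
def record_to_cells(record, x_step, n_cells, empty=None):
    """Convert one scan line of continuous widths into fixed-width cells.

    record  -- list of (width, label) runs, measured from the left margin
    x_step  -- horizontal cell size, in the same units as the widths
    n_cells -- number of cells to produce along the row
    """
    cells = []
    for c in range(n_cells):
        centre = (c + 0.5) * x_step
        right_edge = 0.0
        label = empty
        for width, run_label in record:
            right_edge += width
            if centre < right_edge:  # the cell centre falls inside this run
                label = run_label
                break
        cells.append(label)
    return cells


# Runs of width 30, 45 and 25 sampled with a 25-unit horizontal step.
row = record_to_cells([(30, "A"), (45, "A+B"), (25, "B")], x_step=25, n_cells=5)
```

Sampling at the cell centre is only one possible rule; assigning the
predominant label within each cell would be another.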


ADVANTAGES AND DISADVANTAGES OF PROPOSED TECHNIQUES VS. A SINGLE
UNIFORM GRID RESOLUTION THROUGHOUT THE ENTIRE STUDY AREA AND
MULTIPLE RESOLUTION DATA

The DIGIT series, particularly with the development of INFOS, is a system
for machine-assisted encoding, storage, and manipulation of geographic data.
It is based on coordinate data and has the capability of outputting data in a gridded
format. The analyst may window in on the coverages needed, and set the
resolution to any level. Overlay analysis and mapping is achieved from coordinate
format data and problems inherent in the use of a uniform grid cell size are avoided.
Since INFOS employs a constant scan line increment, it is in one sense analogous to
a uniform grid cell size throughout an entire study area, although its increment is
user determined and set at the time and for the purpose of analysis. Nevertheless,
the merits of INFOS should be weighed against the use of a multiple resolution strategy.

In moving from a uniform grid cell resolution strategy to a multiple resolution
strategy, many of the advantages of a simple uniform cell size (conceptual
simplicity, ease of file maintenance, ease of overlay, and rapid processing) are
lost. The rationale for sacrificing these advantages is to achieve greater detail
where it is important or where coverages are more complex. Through the use of
windowing and then using INFOS, these same advantages can be achieved. In
arriving at the INFOS scan conversion approach, variable grid resolution solutions
were explored and rejected.

In systems which overlay data gridded at different resolutions, problems
of aggregation and prorationing are introduced. Aggregation is the combination
of small areal units into larger ones. Prorationing involves dividing large areas
into smaller ones while maintaining accurate representation of the data. When
an area of a small grid size is overlayed with a larger grid, the two must be
converted to some common cell size (the problem is analogous to the need for
a least common denominator to add fractions). In most cases there is no sound
analytical basis for prorationing. To maintain the theoretical significance of the
data, the small cells must be aggregated to the coarser level. In this aggregation
process any resolution gained by using small grid cells for one coverage is
forfeited. This data loss negates any advantage achieved by using a multiple grid
resolution format. INFOS overcomes this problem by allowing the data to be




scanned at whatever resolution is appropriate for the current analysis. In
addition, all data can be stored at a fairly similar high resolution since the
storage requirements of polygon data are not as sensitive to the level of
accuracy as the storage requirements of grid data.
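
The loss of detail under aggregation can be seen in a small Python sketch.
It assumes, for illustration, that fine cells are grouped into square blocks
and each block takes its predominant category; the function name and the
land use codes are hypothetical.

```python
from collections import Counter


def aggregate(grid, factor):
    """Aggregate factor x factor blocks of cells to their predominant value."""
    rows, cols = len(grid), len(grid[0])
    coarse = []
    for r in range(0, rows, factor):
        coarse_row = []
        for c in range(0, cols, factor):
            block = [grid[i][j]
                     for i in range(r, min(r + factor, rows))
                     for j in range(c, min(c + factor, cols))]
            coarse_row.append(Counter(block).most_common(1)[0][0])
        coarse.append(coarse_row)
    return coarse


# W = water, F = forest, U = urban (illustrative codes only).
fine = [
    ["W", "W", "F", "F"],
    ["W", "U", "F", "F"],
    ["F", "F", "F", "U"],
    ["F", "F", "U", "U"],
]
coarse = aggregate(fine, 2)
```

The isolated urban cell in the upper left block disappears in the coarse
grid, which is the forfeited resolution described above.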


CONCLUSIONS,  OBSERVATIONS  AND  PROGNOSTICATIONS 


The Corps of Engineers is a major consumer of geographic information
system technology, and has recognized the need for developing new
methodologies to overcome the problems inherent in grid cell mapping. Variable grid
resolution encoding represents only one of several alternatives now faced by
HEC. Before investing heavily in "bells and whistles" for grid cell analysis
and mapping technology, HEC should explore a broader range of alternatives,
and undertake research and development to determine the best methods of
achieving an effective and efficient geographic information system to support the
mission of HEC.

INFOS may be representative of emerging geographic information systems
that rely on minicomputer technology. Interactive operation allows analysts
to perform their tasks more efficiently and thoroughly. As an alternative to
both uniform and variable grid resolution encoding, INFOS has been demonstrated
for small coverages, but it has not yet been tested in a large data base
environment. Admittedly, it is still in the research and development stage, but INFOS
or a system based on a similar set of concepts could serve as the base of the
next generation geographic information system for HEC.




APPENDIX A. TECHNICAL DESCRIPTION OF 'INFOS'

Table A.1 shows the list of current INFOS options used to select the
major routines outlined in Figure A.1. INFOS is one of several application
packages in the DIGIT SERIES.

INFORMATION  SYSTEM  OPTIONS: 


(1)  CREATE  A SCANNED  FILE  FROM  DIGITIZED  FILE(s) 

(2)  DISPLAY  SCANNED  AREA  STATISTICS 

(3)  FIND  AREA  COMBINATIONS 

(4)  PLOT  AREA  COMBINATIONS 

(5)  SEND  HARD  COPY  TO  IBM360 

(6)  STOP/EXIT 

Table A.1. INFOS SERIES Options.

For purposes of the present discussion, the DIGIT SERIES creates, edits,
and displays a graphics data file called a DIGITIZED FILE which contains labeled
and unlabeled point, line, and area "elements". These elements are digitized
using the GRAF/PEN digitizer and the IMLAC PDS-4 minicomputer and, once
produced, may be combined (overlayed), disaggregated, associated with tic mark
coordinates, and scaled to any unit system. Each area polygon in these DIGITIZED
FILES is scanned from top to bottom using a simulated raster scan, resulting in a
"scanning record". Using a compressed grid format, these scanning records contain
the width of each polygon along the scan line together with the corresponding
area label. The area label may be up to 18 characters long and contain either
nominal or interval data. The scanning records are stored as a subfile in the
SCANNED FILE and are used to compute area statistics and combinations, and
permit creation of SHADE FILES for plotting selected area combinations. Area
combinations are sets of contiguous grid cells lying within one or more digitized
area polygons. Plotting is accomplished directly on the IMLAC screen, in which
case a photocopy may be obtained, or the SHADE FILES may be used as input to
other software in the DIGIT SERIES to produce 11" or 29" CALCOMP drum plots or
35mm microfilm plots.

Create a Scanned File. Figure A.2 shows a diagram of the raster scan and
scanning record produced by the routine outlined in Figure A.3. After invoking
option 1, the user types a list of DIGITIZED FILE names to be overlayed and
scanned simultaneously. If the IMLAC terminal is being used, these base maps
are displayed on the CRT. The coordinates of each area polygon are stored on a
scratch disk file (AREA COORDINATES FILE) and are used to compute the area,*


*The area is computed using the "polygon method" discussed in (7).

Figure A.3a. Algorithm to CREATE A SCANNED FILE.

Figure A.3b. (Continued).


centroid, and the range of X and Y. Together with the area label, these
parameters are stored in a second scratch file, the AREA PARMS FILE.
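
The parameters kept for each area can be sketched in Python. The text cites
the "polygon method" from reference (7) without giving its formula; the
sketch below assumes it is the standard coordinate (shoelace) formula for a
closed polygon, which is an assumption rather than something the source
confirms. The function name and the dictionary keys are hypothetical.

```python
def area_parameters(coords):
    """Area, centroid, and X and Y ranges of one area polygon.

    coords -- list of (x, y) vertices; the ring is closed if it is not already.
    Assumes a polygon of non-zero area.
    """
    pts = list(coords)
    if pts[0] != pts[-1]:
        pts.append(pts[0])
    twice_area = cx = cy = 0.0
    for (x0, y0), (x1, y1) in zip(pts, pts[1:]):
        cross = x0 * y1 - x1 * y0
        twice_area += cross
        cx += (x0 + x1) * cross
        cy += (y0 + y1) * cross
    signed_area = twice_area / 2
    xs = [x for x, _ in pts]
    ys = [y for _, y in pts]
    return {
        "area": abs(signed_area),
        "centroid": (cx / (6 * signed_area), cy / (6 * signed_area)),
        "x_range": (min(xs), max(xs)),
        "y_range": (min(ys), max(ys)),
    }


square = area_parameters([(0, 0), (100, 0), (100, 100), (0, 100)])
```

The Y range stored here is what lets the scanning step skip areas that a
given scan line cannot cross.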



After displaying the DIGITIZED FILES to be scanned and storing the
area parameters and coordinates on scratch files, the user is asked to supply
two additional parameters: the Y-AXIS STEP SIZE and the X-AXIS TOLERANCE.
The STEP SIZE and TOLERANCE guide the scanning process, which is discrete
along the Y-axis and continuous along the X-axis. The user may choose any
STEP SIZE to a maximum resolution of 1 DU (DU = "digitizer unit" = 1/85 inch).
Any width of a particular area combination along a scan line, or "grid width",
that is less than this TOLERANCE is omitted from the scanning record.

Having set these parameters, the SCANNED FILE header is written and
scanning of the area polygons commences. The value of Y for scan line J
is computed and the AREA PARMS FILE is read to determine which area elements
are intersected by the scan line. An area is intersected if its maximum
Y-coordinate is greater than J and its minimum is less than J. If intersected, the AREA
COORDINATES FILE for area K is read. Corresponding to a line segment in area
polygon K, each adjacent pair of coordinates, (x,y)_I and (x,y)_{I+1}, is checked
to see if it is crossed by the scan line. If crossed, the X-coordinate of this
intersection, (X_L, Y)_J, is computed and stored together with the area identifier
in the vector of intersections for scan line J, where:

     K = 1, ..., no. of area polygons

     I = 1, ..., no. of atoms in area K

     J = Y coordinate of scan, incremented by STEP SIZE

     L = 1, ..., no. of intersections along scan line J

After analyzing each line segment in each relevant area, the vector of intersections
is sorted from smallest to largest and each intersection is then displayed on the
IMLAC screen. The distance between successive intersections, or the grid width
X_{L+1} - X_L, is paired with the label of each polygon crossed to form the scanning
record for grid row J. (See Figure A.2) The scanning record J is then written onto
the SCANNED FILE and optionally displayed on the screen, and J is incremented
by the STEP SIZE.
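
The scanning procedure just described can be sketched in Python. This is a
simplified illustration, not the INFOS code: the function name, the
dictionary of polygons, and the "+"-joined combination labels are
hypothetical, scan lines are placed at the middle of each step starting from
the smallest Y (an assumption made to avoid lines that pass exactly through
vertices), and nothing is written to disk or displayed.

```python
def scan_polygons(polygons, step_size, tolerance=0.0):
    """Simulated raster scan of labelled area polygons.

    polygons -- dict mapping area label to a list of (x, y) vertices
    Returns a list of (y, record) pairs, where record is a list of
    (grid_width, combination_label) runs along that scan line.
    """
    all_y = [y for ring in polygons.values() for _, y in ring]
    y = min(all_y) + step_size / 2
    records = []
    while y < max(all_y):
        crossings = []
        for label, ring in polygons.items():
            ring = list(ring)
            ys = [p[1] for p in ring]
            if not (min(ys) < y < max(ys)):
                continue  # this area is not intersected by the scan line
            for (x0, y0), (x1, y1) in zip(ring, ring[1:] + ring[:1]):
                if (y0 <= y < y1) or (y1 <= y < y0):  # segment is crossed
                    x = x0 + (y - y0) * (x1 - x0) / (y1 - y0)
                    crossings.append((x, label))
        crossings.sort()  # smallest X to largest
        inside = set()    # labels of the areas the scan is currently within
        record = []
        for (x, label), (x_next, _) in zip(crossings, crossings[1:]):
            inside ^= {label}  # entering or leaving this area
            width = x_next - x
            if inside and width > 0 and width >= tolerance:
                record.append((width, "+".join(sorted(inside))))
        records.append((y, record))
        y += step_size
    return records


# Two overlapping rectangles: A spans x = 0..60 and B spans x = 40..100.
areas = {
    "A": [(0, 0), (60, 0), (60, 40), (0, 40)],
    "B": [(40, 0), (100, 0), (100, 40), (40, 40)],
}
scan = scan_polygons(areas, step_size=10)
```

Each record pairs a grid width with the combination of areas covering it, so
the overlap of A and B appears as its own run between the parts covered by A
alone and by B alone.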

When the scanning process is completed, the AREA PARMS FILE is appended
onto the end of the SCANNED FILE to produce the second subfile which is used in
other options. The user is then requested to choose the next elective from the
INFOS option list.


Display Scanned Area Statistics. After a SCANNED FILE is produced,
summary statistics of each area scanned may be displayed. Option 2 reads the
AREA PARMS subfile. For the hypothetical example in Figure A.2, the statistics
may appear as follows:


                         AREA STATISTICS

SCANNED FILE: 'VENN'

SCAN OF DIGITIZED FILE(S):        'VENNA', 'VENNB', 'VENNC'
ORIGIN:                           0000,0000
TICS:                             0000,0000 and 1000,1000
NUMBER OF AREAS:                  3
STEP SIZE:                        25

OVER ALL AREAS
X-AXIS:                           0273 TO 0727
Y-AXIS:                           0273 TO 0727
SUM OF AREAS (POLYGON METHOD):    300,000 SDUs
CENTROID:                         0500,0500

AREA 1
LABEL:                            'A'
LABEL COORDINATE:                 0410,0500
NUMBER OF POINTS:                 1500
AREA (POLYGON METHOD):            100,000 SDUs
X-AXIS:                           273 TO 525
Y-AXIS:                           400 TO 727
CENTROID:                         0400,0525

(Table continues for other areas)

Table A.2. Area Statistics Table.

SDU  stands  for  "square  digitizer  units"  and  may  easily  be  converted  to  more 
convenient  areal  units  such  as  acres  or  square  miles.  A HARD  COPY  of  this 
table  may  also  be  optionally  produced  and  submitted  to  the  IBM360  for  a 
printout. 
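
As a worked example of the conversion, the Python sketch below turns SDUs
into acres. It assumes the file was digitized directly from a 1:24,000 map
(the usual scale of a USGS 7 1/2 minute quadrangle) and has not been
rescaled; both are assumptions, and the function name is hypothetical.

```python
DU_PER_INCH = 85         # one digitizer unit = 1/85 inch on the map
SQ_FT_PER_ACRE = 43560


def sdu_to_acres(sdu: float, map_scale_denominator: float) -> float:
    """Convert square digitizer units to acres on the ground."""
    ground_inches_per_du = map_scale_denominator / DU_PER_INCH
    sq_ft_per_sdu = (ground_inches_per_du / 12) ** 2
    return sdu * sq_ft_per_sdu / SQ_FT_PER_ACRE


# The 100,000 SDU area from Table A.2, assuming a 1:24,000 source map.
acres = sdu_to_acres(100_000, 24_000)
```

Under these assumptions one digitizer unit is about 23.5 feet on the ground,
so 100,000 SDUs come to roughly 1,270 acres, or about two square miles.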

Find Area Combinations. Option 3 accomplishes two primary purposes:
(1) to create a temporary "combinations file", which is a direct access disk
file containing the label and estimated area of each unique area combination
in the SCANNED FILE, and (2) to display and produce a HARD COPY of the
table of area combinations like the one in Table A.3. The first time this
option is selected for a particular SCANNED FILE, this combinations file is




appended onto the end of the SCANNED FILE so that these combinations need
not be computed again. The combination subfile, along with the scanning
record and area parms subfiles, is necessary in order to execute option 4. The
estimated areas in Table A.3 are computed by a different method than those
shown in Table A.2, which does not display the areas of polygon overlap. This
second approach estimates the area of each area combination by summing the
product of the grid widths and STEP SIZE for a particular combination.


                      AREA COMBINATIONS

REF. #     LABEL COMBINATIONS        ESTIMATED AREA

  1.       'A'                       100,000 SDUs
  2.       'B'                       100,000 SDUs
  3.       'C'                       100,000 SDUs
  4.       'A' + 'B'                  40,000 SDUs
  5.       'A' + 'C'                  40,000 SDUs
  6.       'B' + 'C'                  40,000 SDUs
  7.       'A' + 'B' + 'C'            15,000 SDUs


Table A.3. Area Combination Table.

'A' is that part of area 'A' not overlapped by another area; 'A' + 'B' refers to
only those grid cells lying inside both areas 'A' and 'B'. This table, which
may be rather long for realistic projects, may be submitted for HARD COPY to the
IBM360. Depending on the STEP SIZE, the estimated areas in this table are
usually within 1% of the areas estimated using the polygon method.
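
The area estimate used for this table, the sum of grid width times STEP SIZE
for each combination, can be sketched in Python. The function name and the
(width, label) record layout are hypothetical, and the sample records are
invented for illustration rather than taken from the 'VENN' example.

```python
from collections import defaultdict


def combination_areas(records, step_size):
    """Estimate the area of each combination from its scanning records.

    records -- one list per scan line of (grid_width, combination_label) runs
    """
    totals = defaultdict(float)
    for record in records:
        for width, label in record:
            totals[label] += width * step_size
    return dict(totals)


# Four identical scan lines, 10 DU apart, across two overlapping rectangles.
records = [[(40.0, "A"), (20.0, "A+B"), (40.0, "B")] for _ in range(4)]
estimates = combination_areas(records, step_size=10)
```

For shapes with straight vertical sides, as here, the estimate is exact. For
curved or slanted boundaries it approaches the polygon-method area as the
STEP SIZE is reduced.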

Plot Area Combinations. Option 4 can be selected only after the area
combinations subfile has been appended onto the end of the SCANNED FILE
using option 3. For each combination the scanning records are read and grid
cell coordinates computed. These coordinates are then used to construct a
SHADE FILE. A SHADE FILE is a DIGITIZED FILE containing unlabeled line
elements of the scan lines contained within a chosen area combination (grid
widths). The user may select one or more area combinations to be plotted.

The principal steps in constructing SHADE FILES are as follows:

(1) The DIGITIZED FILES comprising the base map used in producing
the SCANNED FILE are displayed on the IMLAC screen.

(2) The user selects one or more area combinations from a table to
be plotted (with or without any overlap for each).

(3) Each scanning record in the SCANNED FILE is read and the
coordinates of each scan line in the selected area are
recorded ("shade lines").




(4)  The  shade  lines  are  plotted  on  the  IMLAC  screen  for  all 
combinations  chosen. 

(5)  The  user  may  request  a photocopy  of  the  screen  and/or  store 
the  SHADE  FILE(S). 

(6)  The  user  may  repeat  the  procedure  from  #2  above  and  optionally 
erase  any  shade  lines  from  the  screen  while  retaining  the  base 
map. 

The  SHADE  FILES  may  be  displayed  together  with  the  base  map  at  a later 
time  or  they  may  be  submitted  to  the  IBM360  for  a CALCOMP  drum  or 
microfilm  plot. 

Option 5, SEND HARD COPY TO IBM360, submits the HARD COPY
FILE to the line printer. The HARD COPY FILE is cumulative in that each
table listed using options 2 or 3 for any number of SCANNED FILES is
appended onto the end. When the user wishes to exit from INFOS, he
selects option 6, STOP/EXIT, whereby control is returned to the DIGIT
SERIES.




APPENDIX B. STRUCTURE OF DIGITIZED FILES (4)


The coordinate files contain all the information necessary to display
their contents including labels, scale factors, tic marks, and origin point.
The entire file is composed solely of character strings and includes (1)
a file header, (2) element headers, (3) a list of coordinates and
labels, and (4) element delimiters. The basic structure is as follows:


(ORIGIN), (SCALE), (TIC MARKS),            - The "file header".
*                                          - Element delimiter.
(ELEMENT TYPE), (ELEMENT SEQUENCE),
(NUMBER OF ATOMS), (ATTRIBUTES),           - Element header.
                                           - Coordinates.
*                                          - Element delimiter.
(Next element)
                                           - End of file mark.


The following example shows the structure of a typical digitized file
containing an area, labeled line, labeled point, and text elements:


0000  0000,  0001 
4095,  *,  A,  1, 
DL107.0  3I15,  3452 
1211,  *,  L,  2, 
1221 ’RIGHT  LABEL' 
3,  0 0000  0000 
3442  2232’ POINT 
0000  L09 


0001,  0000  0000  4095  0000  4095  4095  0000 

5 0 0000  OOOOA  3212  1203'AKEA  IJVBEL' 

3422  0023  1283  3302  1 189  0033  1221  1190 

2,  0 0000  OOOOT  0212  31 22'Li:rT  WBEL’  R 0021 
'l21  Z02110,  1232  2210  2209  0001,  *,  P,  3, 

L21  ’S03  7.03115,  1278  3223T01NT  1’, 

2’  0093  3432’PQlNT  3',  * T.  4,  1,  0 0000 

Z03I03,  2332  3442’TEXT  ATOM’ , * S 


The ORIGIN is 0,0 for X,Y, the SCALE is 1,1 for X,Y, and the
default TIC MARKS are shown in the file header. Following this, "*"
indicates that an element is beginning. The "A" means that the element is an
area element; "1" and "5" mean that it is the first element in the file and
contains five atoms. The next string is composed of 95 characters,
which are blank. The element origin, the position of the

Other characteristics are indicated in the "attribute string". The D means
that the area symbol is a dashed line, and "Z03I15" denotes that the element
has a label size of three with intensity 15. The five atoms
(x,y) appear in the coordinate string. In the case of labeled points and text
elements, the labels are contained within the coordinate strings and delimited
by apostrophes. The symbol type for point elements is indicated in the
attribute string; "S01" denotes a triangle, "S02" a square, and "S03" a circle
symbol. In labeled elements, the attribute string also contains the total
number of characters in all labels. "L10" would indicate that the element
contains a total of ten characters. All coordinate strings contain alphanumeric
numbers in digitizer units relative to the element origin.
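The attribute codes described above can be illustrated with a short decoding sketch in Python. It is not part of the DIGIT SERIES software. The function name, the returned field names, and the assumption that every numeric code is a letter followed by exactly two digits are hypothetical; only the meanings of D, L, Z, I, and S01 to S03 come from the text.

```python
import re

# Point symbol codes named in the text.
POINT_SYMBOLS = {"S01": "triangle", "S02": "square", "S03": "circle"}

# Numeric codes named in the text; the field names are hypothetical.
NUMERIC_FIELDS = {"L": "label_chars", "Z": "label_size", "I": "intensity"}

# Assumed token shape: one capital letter, optionally followed by two digits.
_TOKEN = re.compile(r"([A-Z])(\d{2})?")


def parse_attributes(attr):
    """Decode an attribute string such as 'DL10Z03I15' into a dictionary."""
    result = {}
    pos = 0
    while pos < len(attr):
        match = _TOKEN.match(attr, pos)
        if match is None:
            raise ValueError(f"unexpected character {attr[pos]!r} at {pos}")
        code, digits = match.groups()
        if code == "D" and digits is None:
            result["dashed"] = True              # dashed area symbol
        elif code == "S" and digits is not None:
            result["point_symbol"] = POINT_SYMBOLS.get(code + digits, "unknown")
        elif code in NUMERIC_FIELDS and digits is not None:
            result[NUMERIC_FIELDS[code]] = int(digits)
        else:
            # Codes not described in the text are kept, not interpreted.
            result.setdefault("unrecognized", []).append(match.group(0))
        pos = match.end()
    return result
```

For the area element discussed above, `parse_attributes("DL10Z03I15")` yields a dashed symbol, ten label characters, label size three, and intensity 15.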




References


1. Chrisman, N., "Topological Geographic Representation,"
Proceedings, Urban and Regional Information Systems Association
(URISA), Volume 1, August, 1975, 194-202.

2. Dueker, K.J., "Geographic Data Encoding Issues," Technical
Report #43, The Institute of Urban and Regional Research, University
of Iowa, Iowa City, Iowa, April 1975.

3. Dueker, K.J., and Noynaert, E., "Interactive Digitizing, Editing and
Mapping: Data Structure Considerations," Prepared for Advanced
Study Symposium on Topological Data Structures for Geographic
Information Systems, The Laboratory for Computer Graphics and Spatial
Analysis, Harvard University, October 16-21, 1977.

4. Ericksen, R.H., "DIGIT SERIES User's Manual," Department of
Geography, University of Iowa, Iowa City, Iowa, September, 1976.

5. Land Use Analysis Laboratory, "Status Report of the Land Use Analysis
Laboratory," Iowa Agricultural and Home Economics Station, Iowa
State University of Science and Technology, November 1973 thru
November 1974, 1-3, 56-76.

6. Land Use Analysis Laboratory, "Research Considerations - A Land
Information System Support Program," Iowa Agricultural and Home
Economics Station, Iowa State University of Science and Technology,
Ames, Iowa, 1977, 1-4, 92-102.

7. Tomlinson, R.F., et al., "Computer Handling of Geographic Data,"
Natural Resource Research XIII.

8. Werner, R., "Digitizer System User's Guide," Department of Geography,
University of Iowa, Iowa City, Iowa, May 1976.




GEOGRAPHIC  DATA  HANDLING  ISSUES: 

AN  ALTERNATIVE  TO  VARIABLE  GRID  RESOLUTION 

DISCUSSION 




Comment: The main accomplishment of this new system is that the polygons
are overlaid and a multivariable file is created directly, instead
of taking each polygon file separately, gridding it and then merging
the grid files to form a multivariable file. This is unique. Only
variables of interest need to be overlaid and grids are generated
at the desired resolution independent of other overlays. The user
gets to make the resolution decision prior to each overlay analysis
instead of choosing a resolution at the beginning of data collection.
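The idea in this comment, gridding several polygon coverages in one pass and writing a multivariable record directly, can be sketched as follows. This is a minimal illustration and not the INFOS code. It assumes simple polygons given as vertex lists, samples each cell at its center, and uses an even-odd crossing rule; the function and variable names are hypothetical.

```python
def crossings(polygon, y):
    """Sorted x positions where the scan line at height y crosses the polygon."""
    xs = []
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Half-open test so a vertex lying on the line is counted once.
        if (y1 <= y < y2) or (y2 <= y < y1):
            xs.append(x1 + (y - y1) * (x2 - x1) / (y2 - y1))
    xs.sort()
    return xs


def overlay_line(coverages, y, x0, step, n_cells):
    """Build one scan line of a multivariable grid.

    coverages maps a variable name to a list of (label, polygon) pairs.
    The result is a list of n_cells dictionaries, one per grid cell, each
    holding the label of every variable that covers the cell center.
    """
    record = [dict() for _ in range(n_cells)]
    for variable, polygons in coverages.items():
        for label, polygon in polygons:
            xs = crossings(polygon, y)
            # Crossings pair up as (enter, leave) under the even-odd rule.
            for x_in, x_out in zip(xs[0::2], xs[1::2]):
                for i in range(n_cells):
                    center = x0 + (i + 0.5) * step
                    if x_in <= center < x_out:
                        record[i][variable] = label
    return record
```

Because `step` is an argument of each call, the resolution is chosen at overlay time rather than at data capture, and only one line of cells is held in memory at once.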


Q.  Can  you  vary  the  step  size  within  a single  run  of  an  overlay? 

A.  We  do  not  currently  have  this  capability;  it  requires  uniform  steps 
(Y  dimensions)  of  the  variable  resolution.  Variable  resolution  can 
be  achieved  by  windowing  the  various  areas  of  interest. 


Q.  The HEC experience has been to capture data in the polygon form,
create grid cell data for each and then merge the grid files to form
a multivariable file, and then we do engineering, economic and
environmental analysis. The technique described so far seems to emphasize
the creation of the multivariable file at the analysis phase.

A.  (Dangermond)  This  is  true  except  there  is  a reduction  in  job  steps 
in  the  creation  process  which  saves  time  and  also  when  you  grid  all 
the  polygons  at  once  you  get  away  from  some  of  the  practical  problems 
of  resolution  of  cartography  and  doing  them  all  at  once  lends  itself 
to  a new  file  structure  which  is  what  Ray  and  I will  discuss  this 
afternoon. 


Comment:  There  is  an  aspect  of  the  problem  which  is  a subtlety  I (Dueker)  have 

trouble  grasping;  that  is  the  extent  to  which  the  grid  cell  data  bank 
becomes  the  point  of  departure  for  all  analysis,  vs.  deciding  which 
separate  coverages  need  to  be  combined  for  a particular  analysis,  and 
then  make  the  decision  as  to  the  resolution  needed.  I guess  I am 
saying,  do  the  different  analysis  programs  such  as  HYDPAR  and  DAMCAL 
have  different  data  resolution  requirements?  If  they  do,  then  our 
approach  would  have  more  power  than  a fixed  grid  cell  data  bank. 




Q.  Then in the context of the objectives of the seminar, you are dealing
with between-variable resolution and not within-variable resolution?

A.  Yes, except for our windowing and scaling capability, which allows one
to do within-variable resolution.


Comment:  (Postma)  There  is  variable  resolution  from  a data  compaction  scheme, 

which  uses  rectangles  of  different  lengths  as  the  mechanism  to  capture 
the  variable  resolution  in  the  X direction. 


Q.  What kind of capability in terms of storage and processing time is
involved? Is it in your best interest to continually overlay polygons
prior to the analysis? In our situation it seems that gridding
the data is a major expense and that we would prefer to do that as a
prestep prior to any analysis and store it once than to continually
do that overlay.

Another thing is that we like to do each data variable separately
so that we don't have to know where everything else is.

A.  Yes, there is a problem when overlays are made, and the computer cost
increases with the increasing complexity of the overlays. But the
storage requirements do not go up because we deal with one line at a
time. That is a key factor. Since we use a "mini", we can let it
grind away for long periods of time fairly inexpensively. We are,
however, still in an R & D mode and further work needs to be done.
There needs to be an investigation of pregridding versus gridding at
the time of analysis.


Q.  Every time our organization has tried to do a cost-benefit analysis
for an interactive scope or tube, to let us interactively edit files,
we have always come up with the result that it is not cost effective.
Is it practical for you to have interactive equipment in a production
environment which digitizes thousands of complex polygons daily?

A.  Our experience is more in the education side, where it is very
beneficial for the naive user to be able to see just what is going on.

I would speculate that it would be very useful in the editing process.
Much editing can be performed in batch mode (e.g. join) as well as the
gridding, although we have not developed the software to do this.


You  can't  overlay  complex  maps  on  the  tube  with  the  original  map. 
Therefore,  we  need  a huge  screen  or  else  it  is  worthless. 

I agree.  (Oak  Ridge  National  Laboratory  is  projecting  back  onto  the 
original  map  from  a video  tape  of  the  CRT). 


Q.  How are polygon intersections and neighborhood relationships determined
by this system? Say there are two polygons (A and B) and they
look like the figure. There are two intersection polygons, AB1 and
AB2. Does the system know that they are actually intersections of
the same polygons, or does it treat them as two separate intersections?

A.  The system treats each intersection separately. How important this
is, I don't know.


GEOGRAPHIC DATA HANDLING ISSUES:

AN ALTERNATIVE TO VARIABLE GRID RESOLUTION

AUTHOR'S ADDITIONAL COMMENTS


Although the current INFOS routine scans only area polygons, an
extension to include labeled lines and points is planned, which will
enable conversion of point and linear features to a grid format.
This would be particularly useful for land use systems. The software
which does the "raster scanning" is satisfactorily developed. There
are a number of potential extensions of the system which process the
associated 18-byte labels (which currently store data associated with
polygons). For example, the label may contain flags which
direct processing of the data in particular ways (e.g. to specify a search
radius to look for adjacent point information for analysis of topographic
surfaces). Such application programs must be tailored to the
intended function, of course. Another extension could be the interpolation
of labeled point information over a labeled area polygon to
compute slopes as shown below: at intervals along the scan line,
interpolations could be made from surrounding labeled points to compute
the value of the interval (e.g. grid cell).


[Figure: a scan line crossing an area polygon, with labeled points nearby]
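The proposed extension might be sketched as follows. The text does not name an interpolation method, so inverse-distance weighting within a search radius is used here purely as an assumption; the function name and argument layout are likewise hypothetical.

```python
import math


def interpolate_scan_line(y, x_start, step, n_cells, points, radius):
    """Estimate a value for each interval (grid cell) along one scan line.

    points is a list of (x, y, value) labeled points. Each cell center is
    assigned the inverse-distance-squared weighted mean of the points that
    lie within `radius` of it, or None when no point is close enough.
    """
    cells = []
    for i in range(n_cells):
        cx = x_start + (i + 0.5) * step
        weighted_sum = 0.0
        weight_total = 0.0
        exact = None
        for px, py, value in points:
            dist = math.hypot(px - cx, py - y)
            if dist > radius:
                continue                      # outside the search radius
            if dist == 0.0:
                exact = value                 # point sits on the cell center
                break
            weight = 1.0 / (dist * dist)
            weighted_sum += weight * value
            weight_total += weight
        if exact is not None:
            cells.append((cx, exact))
        elif weight_total > 0.0:
            cells.append((cx, weighted_sum / weight_total))
        else:
            cells.append((cx, None))
    return cells
```

If the labeled values are elevations, differences between neighboring cell values divided by the step size would give a rough slope along the line. The search radius plays the role of the flag that the text suggests storing in the polygon label.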


The main advantages of INFOS are as follows:

(a) Cheap, fast polygon to grid conversion, with simultaneous
overlay

(b) Multivariable data stored in one file

(c) Optional interactive control display

(d) Variable length file records

(e) Very little core required




2.  Drawbacks of the current system include:


(a) Software is somewhat hardware specific (HP-BASIC), although
convertible to PL/I

(b) Resolution is maintained at a constantly high level, but cannot be
varied within the same run along the Y-axis without windowing
or scaling as currently written

(c) Additional software would be required to analyze slope data and
many other specialized applications

(d) Few data analysis routines have been written, except for finding
the areas of polygon overlays, centroids, etc. No attempt has
been made to link INFOS to hydrologic models.

(e) Currently, data is stored in the label, but the label could point to
other data vectors or parameters for analysis purposes.

3.  HEC should begin moving away from a grid cell data bank and begin
implementation of the following procedures (minimizing dependence
on the grid-then-store approach that fixes resolution):

(a) Generate HYDPAR parameters directly from a digital terrain model
(grid or triangular as determined by a special investigation)

(b) Explore the use of a man-machine environment for damage calculation
(DAMCAL/HEC-1). Man's judgement and computer assistance
would combine for more effective analysis (immersion into a
mini-computer/interactive environment):

1)  Select a window representing a flood plain reach for an
analysis

2)  Create working files from relevant coverages for window
area

3)  Compute flooded area for flood under consideration

4)  Scan-line overlay analysis to determine amount of land use,
by type, flooded.

4.  IURR role

(a) Further research and development of scan-line technology

(b) Man-machine environment research

(c) Nested land use classification research
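Step 4) of the procedure proposed under item 3(b), the scan-line overlay that tallies flooded land use by type, might look like the sketch below. It assumes that the flood extent and the land use coverage within the window have already been reduced to runs (x intervals) on a common set of equally spaced scan lines; the function name and the data layout are hypothetical and do not describe existing INFOS or HEC code.

```python
from collections import defaultdict


def flooded_area_by_use(lines, line_spacing):
    """Accumulate flooded area for each land use type.

    lines is an iterable of (flood_runs, landuse_runs) pairs, one per scan
    line. flood_runs is a list of (x_start, x_end) intervals that are
    flooded; landuse_runs is a list of (x_start, x_end, use_type).
    Area is overlap length times the spacing between scan lines.
    """
    totals = defaultdict(float)
    for flood_runs, landuse_runs in lines:
        for fx0, fx1 in flood_runs:
            for lx0, lx1, use_type in landuse_runs:
                overlap = min(fx1, lx1) - max(fx0, lx0)
                if overlap > 0:
                    totals[use_type] += overlap * line_spacing
    return dict(totals)
```

Only one scan line needs to be in memory at a time, which is the property emphasized in the discussion. Finer line spacing gives a better area estimate at higher computing cost.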




VARIABLE  GRID  RESOLUTION  - ISSUES  AND  REQUIREMENTS: 
THE  ADAPT  SOLUTION 


By 

W. E. Gates and Associates*


INTRODUCTION 

The  Hydrologic  Engineering  Center  of  the  US  ARMY  Corps  of  Engineers  has 
been  engaged  in  automated  spatial  data  processing  for  purposes  of  hydrologic 
analysis  and  planning,  utilizing  a regular  grid  system  in  conjunction  with  a 
series  of  analysis  programs  and  models.  HEC  has  requested  proposed  solutions 
to  the  problem  of  incorporating  a variable  resolution  capacity  into  their 
overall  process.  W.  E.  Gates  and  Associates  is  pleased  to  present  what  we 
are  confident  is  a workable,  efficient,  compatible,  and  proven  solution 
to  this  problem. 

The ADAPT system, developed by personnel of WEG/A over the past five
years and applied to some 50,000 square miles in five states, is a geographic
information system designed primarily for use with engineering, planning,
and design models, in particular where terrain considerations are important.

The ADAPT system is based upon the concept of triangular cells of varying
size and shape. Each cell is defined in space by the three-dimensional
coordinates of its vertices. This approach provides a geographic information
system with variable resolution and excellent capacity for terrain modeling.
Because of the cell orientation of the ADAPT system, models are easily
interfaced with the data base. The variable resolution capacity, combined with
the unambiguous definition of a plane in space by the triangle cell, makes
possible the terrain modeling capacity so essential for engineering use of the
system. Thus, the solution of the variable resolution problem embodies within
itself a solution to the terrain modeling problem. The triangulated cell system
utilized in ADAPT also provides for storage and manipulation of point, line,
polygon, and network data.

The cell-based organization of ADAPT is similar to that currently used
by HEC in that each cell record contains within itself information on all the
data variables referenced to that cell. An ADAPT file can be processed
sequentially or by direct access methods, allowing great flexibility in the
design of models which access the data base, and also allowing for economies
in use of core storage.

Because  the  organization  of  data  is  similar  to  that  utilized  currently 
by  HEC,  and  because  ADAPT  files  can  be  processed  sequentially  (as  are  HEC 
files),  two  options  present  themselves: 


* M. Males, Vice President, 1515 Batavia Pike, Batavia, Ohio 45103




the existing HEC programs operating on the regular grid cell data
base could be converted (probably without major difficulty) to
operate directly upon the triangle cells


or

the triangle cells could be used to capture terrain and other highly
variable resolution data, and an interface written to either insert
this data directly into a regular gridded data base or allow for
access from the existing HEC programs.

The ADAPT system has been described extensively in prior presentations and
submittals to HEC (Reference 1). The remainder of this paper contains a
brief summary of the ADAPT data structure, the proposed data flow to the
HEC programs, a discussion of the computer graphics requirements/constraints
of the solution, the relative advantages and disadvantages of the methodology,
and a conclusion and summary.


ADAPT  DATA  STRUCTURE 




An ADAPT data base consists of, at a minimum, two files: a triangle file
and a vertex file. Other files, such as polygon files, may be maintained,
but are not strictly necessary. The triangle file is the main 'data carrier',
with the vertex file being an auxiliary file used primarily in data base
construction and update. The triangle file is a digital representation of
the triangulated network of cells which is constructed to 'capture' the data
of interest. In essence, this network partitions the area into the 'least
common denominator' spatial units, such that each triangle is homogeneous
in all of its attributes. A triangle side marks either a change in local
land slope or a change in a data attribute. In this fashion, all of the
important polygons, ridge lines, streams, etc., are embedded within the
triangle network as triangle sides. Within this network, triangles and vertices
are numbered uniquely.

The  triangle  file  represents  this  triangulated  network  by  maintaining 
a single  record  for  each  triangular  cell.  The  record  contains  three  distinct 
sections,  a topological  section,  a locational  section,  and  an  attribute  section. 

The topological section for each triangle contains information necessary to
define the neighborhood relationship of the triangle grid; in essence, this
constitutes the three vertices for each triangle, and the three adjacent
triangles. The locational portion contains centroid coordinates for the
triangle, coordinates of the vertices, triangle area, and magnitude and
direction of the gradient of the triangle. In a typical application, 25 words
are used to maintain the topological and locational information. The remainder
of the data record is attribute data, organized at the discretion of the user.
Again for a typical application, 25 words of attribute data are made
available, but this can be increased in increments of 49 words. Much of the
data in the topological and locational portions of the record is 'redundant'
in that it is either stored elsewhere or is directly derivable from stored
data. The decision to store this redundant data within the triangle record and
incur the associated 'overhead' storage charge was based upon the desire to
have a user-oriented system which is easily programmed, and on the relative
cost breakdown between storage and access costs on the large computers on
which the ADAPT system is implemented. For these configurations, it is
preferable to have all the information relating to a triangle when that
triangle is processed, without incurring additional disk reads or
calculations. In other configurations, a reduction in redundancy might be
indicated (e.g. for a minicomputer-based system). The overhead associated
with the strictly necessary data storage can be reduced to four words.
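The 'redundant' locational items can all be derived from the three vertex coordinates, as the following Python sketch illustrates. It is not the ADAPT record layout: the class and field names are hypothetical, the area is taken as the plan (map) area, and the gradient direction is assumed to be measured counterclockwise from the +x axis in the uphill direction.

```python
import math
from dataclasses import dataclass, field


@dataclass
class TriangleRecord:
    vertex_ids: tuple            # topological: the three vertex numbers
    neighbor_ids: tuple          # topological: the three adjacent triangles
    vertices: tuple              # locational: three (x, y, z) coordinates
    centroid: tuple              # derived: mean x and y of the vertices
    area: float                  # derived: plan area of the triangle
    gradient: float              # derived: magnitude of slope (rise/run)
    gradient_direction_deg: float
    attributes: dict = field(default_factory=dict)


def build_record(vertex_ids, neighbor_ids, vertices, attributes=None):
    """Fill in the derivable fields from the three (x, y, z) vertices."""
    (x1, y1, z1), (x2, y2, z2), (x3, y3, z3) = vertices
    det = (x2 - x1) * (y3 - y1) - (x3 - x1) * (y2 - y1)
    if det == 0:
        raise ValueError("vertices are collinear in plan view")
    # Coefficients of the plane z = a*x + b*y + c through the vertices.
    a = ((z2 - z1) * (y3 - y1) - (z3 - z1) * (y2 - y1)) / det
    b = ((x2 - x1) * (z3 - z1) - (x3 - x1) * (z2 - z1)) / det
    return TriangleRecord(
        vertex_ids=tuple(vertex_ids),
        neighbor_ids=tuple(neighbor_ids),
        vertices=tuple(vertices),
        centroid=((x1 + x2 + x3) / 3.0, (y1 + y2 + y3) / 3.0),
        area=abs(det) / 2.0,
        gradient=math.hypot(a, b),
        gradient_direction_deg=math.degrees(math.atan2(b, a)),
        attributes=dict(attributes or {}),
    )
```

Storing these values in the record trades storage for the cost of recomputing them on every access, which is the trade-off described in the paragraph above.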

Access to the data bank is normally through the triangle record, i.e.
given that I am in a certain cell, what are the attributes of that cell.
However, a variety of other access methods have been devised and incorporated
into the ADAPT system over the years. The use of inverted files, containing
the triangle record numbers for a given attribute, eliminates much searching.
Inverted files for any attribute are easily constructed with a single scan of
the data base. The use of polygon retrieval, however, has largely displaced
the use of inverted files. In polygon retrieval, a closed vertex chain
(i.e. a closed boundary, the sides of which are all triangle sides) is identified
with a given area, as, for example, a basin, or a damage reach, or political
jurisdiction. This polygon is identified by a retrieval number, and stored
in the polygon file. Whenever it is desired to perform an operation upon the
triangular cells within that polygon, a simple subroutine call retrieves
the polygon chain from the polygon file, determines all of the triangular cells
within that polygon chain, and presents a directory of the record numbers to
the main program. The advantage of this form of retrieval is significant;
attributes within the cells may change over time, or with corrections to the
data base. With an inverted file, the entire file would need to be scanned
and the inverted file recreated; with polygon retrieval, this is not necessary,
and the triangle data retrieved is always the most current. Additionally,
inasmuch as it is possible to update a given triangulation pattern and add
more cells, the polygon retrieval will automatically recall all of the new
cells (provided that they are wholly enclosed within the polygon). The cost
of calculations associated with polygon retrieval is low, and the increase
in the usability of the system is quite large.
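One way to picture polygon retrieval is the sketch below. The paper does not say how the cells inside a chain are determined, so a point-in-polygon test on triangle centroids is assumed here. Because every side of the chain is a triangle side, each triangle lies wholly inside or wholly outside the chain, so testing the centroid is sufficient. The function names and the centroid dictionary are hypothetical.

```python
def point_in_polygon(x, y, chain):
    """Even-odd ray test: is (x, y) inside the closed vertex chain?"""
    inside = False
    n = len(chain)
    for i in range(n):
        x1, y1 = chain[i]
        x2, y2 = chain[(i + 1) % n]
        # Does this side straddle the horizontal line through the point?
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def retrieve_polygon(chain, centroids):
    """Return a directory of triangle record numbers inside the chain.

    centroids maps each triangle record number to its (x, y) centroid,
    as stored in the locational section of the triangle record.
    """
    return sorted(
        record for record, (cx, cy) in centroids.items()
        if point_in_polygon(cx, cy, chain)
    )
```

Since the directory is computed at retrieval time from the current triangle file, cells added by a later update of the triangulation are picked up without rebuilding any index, which is the advantage claimed over inverted files.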

The  Inverse  of  polygon  retrieval  is  polygon  input.  Data  assignments  can 
be  made  to  all  of  the  triangles  within  a given  polygon,  allowing  for  simple 
updating  of  large  areas.  By  formatting  data  manipulations  into  input  polygons 
and  retrieval  polygons,  the  filling  of  the  data  bank  can  be  done  in  an  orderly 
fashion. 

Digitizer  software  developed  for  the  ADAPT  system  allows  the  triangulated 
network  cell  system  to  be  captured  in  a single  digitizing  pass.  The  associated 
polygons  can  similarly  be  captured  from  a digitizer.  Thus,  communication 
with  the  data  base  can  be  performed  in  an  almost  exclusively  graphical 
framework,  if  desired. 

It should be noted that, although the system operates with triangles
as its basis, and programmers, of course, must be cognizant of the triangle
cell data structure, ADAPT has evolved over the years towards a system in
which the triangles become essentially transparent to the user. Communication
with the system is either graphical or by specification of polygon
numbers or plane coordinate locations. Only in the correction/edit phase,
and in certain portions of the data insertion phase, need the user be concerned
about keeping track of the internal triangle identifiers.




Within the above context of the ADAPT data structure, it is possible to
discuss the ramifications of the variable resolution capacity of the system.
Based on our experience, variable resolution is an extremely powerful adjunct
to the manipulation of spatial data, but the existence of such a capacity
creates certain problems of its own. Under a fixed resolution approach, only
one decision need be made after assessing all the facts, and then you live
with it. The decision, for example, to process all data at two acre grid cells
is such a binary, one-time decision. With a variable resolution system, on
the other hand, one is continually faced with decisions as to information
cost, information value, the use of a given kind of data in a particular
decision-making process, etc. All of these factors are simultaneously
integrated into the decision on how big a given cell may have to be to
adequately (for the purposes of the given study) define terrain, or land use,
or soils. In a moderately sized study area (6000 square miles) such a decision
might have to be made 45,000 to 50,000 times. The key problem with a variable
resolution system is simply that because it is theoretically feasible to
store data at the finest level of resolution (even if such data is appropriately
handled at grosser levels), and because of the tremendous vested interest
which accumulates around certain kinds of data (land use in particular),
the decisions made in accord with a variable resolution capacity become overt
assessments of quality/utility of various forms of data. This is not a trivial
consideration, as we have learned, to our sorrow.

The essential variable resolution capacity of the data structure lies in
the ability of the triangles to be modified in size and shape, providing that
the grid construction rules are followed. Because of the variable cell size,
each cell must carry with it information as to its own area and location in
space (unlike the regular grid cell). Given the range and variety of situations
and data types that may be captured by the triangle grid, it is difficult to
give a hard and fast estimate of the relative number of cells necessary to
define an area by regular grids versus triangular cells. Additionally, the
ability of the triangle sides to capture the polygons defining attribute
boundaries means that the triangles themselves can provide an inherently
superior representation of the actual location of the boundaries. In typical
applications to date, the range of triangle sizes is roughly two orders of
magnitude around the mean size. In one such application, a county of roughly
160,000 acres was represented by some 2,300 triangles. The data attributes
considered in the triangulation were soils, terrain, future land use, and
hydrographic boundaries. Thus, the mean triangle size was some 70 acres,
with a range from less than ten acres to greater than 300 acres. A utility
program within the ADAPT system provides histograms of the frequency distribution
of number of triangles by area, in order that statistics on the behavior of
the variable grid for different situations can be compiled and examined over
time.
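The arithmetic behind the county example, and the kind of tabulation such a utility program might produce, can be sketched as follows. This is not the ADAPT utility itself; the function names and the class limits used in the usage note are hypothetical.

```python
from bisect import bisect_right


def mean_cell_size(total_area, n_cells):
    """Average cell area, in the same units as total_area."""
    return total_area / n_cells


def area_histogram(areas, class_limits):
    """Count cells per area class.

    class_limits must be in ascending order. The result has one more entry
    than class_limits: cells below the first limit, cells between each pair
    of consecutive limits, and cells at or above the last limit.
    """
    counts = [0] * (len(class_limits) + 1)
    for area in areas:
        counts[bisect_right(class_limits, area)] += 1
    return counts
```

For the county cited, `mean_cell_size(160000, 2300)` is about 69.6 acres, consistent with the "some 70 acres" quoted. With illustrative class limits of 10, 100, and 300 acres, `area_histogram` would separate the cells smaller than ten acres and larger than 300 acres from the bulk of the distribution.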


The average triangle size can vary significantly, depending upon the
nature of the study being undertaken. In one case, a 1.6 square mile area
was examined with 2,000 triangles for detailed hydrologic studies; the same
area was also captured with a grid of 51 triangles, for other purposes. In
a preliminary analysis of sewer design, in another study area, a density of
triangulation of 200 triangles/square mile was used in the first cut; in a
second round design, the density was increased to 400 triangles/square mile
to obtain a better delineation of terrain variation. The triangle size range
is limited only by availability of base maps at the appropriate working scale,
and by computer precision. ADAPT normally stores locational information as
decimal degrees of latitude and longitude, and as state plane coordinates
in feet. Digitizer software is capable of resolving to within ten feet on the
ground from a 7.5 minute topo map sheet source document using a digitizer
of .001" resolution. The system can also be set up to use an arbitrary
coordinate system, so that, if desired, even detailed 'micro' areas could be
handled. In general, however, the local state plane coordinate system appears
to be the most workable coordinate system for most efforts.
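The relation between digitizer resolution and ground distance is a simple scale conversion, sketched below. The map scale is not stated in the text; 1:24,000, the usual scale of 7.5 minute topographic quadrangles, is assumed in the usage note, and the function name is hypothetical.

```python
def ground_resolution_feet(digitizer_resolution_inches, scale_denominator):
    """Ground distance, in feet, spanned by one digitizer unit on the map."""
    ground_inches = digitizer_resolution_inches * scale_denominator
    return ground_inches / 12.0
```

Under that assumption, `ground_resolution_feet(0.001, 24000)` is about 2 feet, so the instrument itself is well inside the ten foot figure quoted above.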

The inherent variable resolution capacity of the triangle framework
provides the capacity to handle much of the data necessary in physical planning
environments. The most variable data items are typically terrain, soils,
and demographic/land use data. Terrain, as noted, is easily handled. Detailed
soils surveys can be input, but for larger areas, soil association data,
varying with less frequency, is adequate for many purposes. Future land use
is generally presented as a broad-brush picture, and is amenable to manipulation.
Existing land use, however, is frequently much more detailed; in some
situations, down to individual house locations. For certain applications, as
for assessment of flood damage, a good picture of detailed land use is
important. Unfortunately, detailed land use data is one of the most volatile
forms of spatial data, and is often out of date as soon as it is compiled. The
ADAPT system was confronted with the problem of handling detailed land use
data in an application involving calibration of nonpoint source models from
field data. A detailed triangulation of the monitoring sites for terrain and
soils was available, but inclusion of the extremely variable existing land use
data directly within the triangulation framework would have required an
excessive number of triangles, and would have been of limited value due to the
aforementioned volatility of such data. Accordingly, a somewhat different
approach was utilized; this approach provides a significant second-level
variable resolution capacity, over and above that provided with the triangles
themselves.

In normal applications, each triangle is considered to be homogeneous
in all of its attributes; there is no variation within a triangle. To handle
the detailed land use data, however, this restriction was relaxed, and land
use was defined as actual acreages of various land use types within a given
triangle. That is, for each triangle, a set of counters is maintained;
these counters accumulate the land use area for each category of interest.
The data is obtained from detailed land use maps which have been digitized
as variably-sized circles of a particular land use. Software then determines
the relationship of the digitized detailed land use element (a circle of
defined area, land use category, and spatial location) to a pre-existing
triangle network, and accumulates the areas. The spatial location of each
detailed land use element is known during the processing, but is not currently
retained after the area associated with that element has been assigned to
the appropriate triangle. The resulting data file consists of triangle
records, in which certain attribute values are land use areas of a given type
that lie within that triangle. In this manner, the land use mix for a given
triangle does not have to be estimated/assumed from a grosser land use
categorization, but rather can be directly calculated. Within this technique of
detailed land use assignment, a number of approaches are utilized to minimize
the search associated with assigning a land use element to a given triangle.
Primary among these approaches is the use of a polygon as a basic unit against
which detailed land use might be digitized. For example, a traffic zone,
pre-existing as a polygon in the polygon file, might be used as the basis.


57 


A background  land  use  of  lovr-denslty  residential  might  be  assigned  to  this 
polygon.  Software  would  then  retrieve  all  triangles  Included  within  the  poly- 
gon, and  assign  the  appropriate  background  value  of  low-density  residential. 

As each land use element is processed, its absolute spatial location is
known from its digitizer coordinates. All of the triangles in the polygon
are searched to determine which contains that absolute spatial location, and
the land use element is then assigned to that triangle, as an adjustment to
the previously defined background value. This form of processing allows
an orderly and relatively rapid approach to what otherwise appears to be
an almost intractable problem.
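
The assignment procedure described above can be sketched as follows. This is
a minimal illustration in Python, not the ADAPT software itself: the names
`Triangle`, `contains`, and `assign_elements`, the record layout, and the
simplification that each circle is credited wholly to the triangle containing
its center are all assumptions made for the example.

```python
from dataclasses import dataclass, field


def _cross(o, a, b):
    """Cross product of (a - o) and (b - o); positive for a left turn."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])


@dataclass
class Triangle:
    vertices: tuple                                  # three (x, y) pairs
    land_use: dict = field(default_factory=dict)     # category -> area

    def area(self):
        a, b, c = self.vertices
        return abs(_cross(a, b, c)) / 2.0

    def contains(self, p):
        a, b, c = self.vertices
        d1, d2, d3 = _cross(a, b, p), _cross(b, c, p), _cross(c, a, p)
        has_neg = d1 < 0 or d2 < 0 or d3 < 0
        has_pos = d1 > 0 or d2 > 0 or d3 > 0
        # Inside (or on an edge) when the three signs do not disagree.
        return not (has_neg and has_pos)


def assign_elements(triangles, background, elements):
    """triangles: the triangles of one polygon (hypothetical layout).
    elements: (x, y, area, category) for each digitized circle."""
    # Step 1: every triangle starts entirely in the background category.
    for t in triangles:
        t.land_use = {background: t.area()}
    # Step 2: search only this polygon's triangles for each element and
    # record the element as an adjustment to the background value.
    for x, y, area, category in elements:
        for t in triangles:
            if t.contains((x, y)):
                t.land_use[category] = t.land_use.get(category, 0.0) + area
                t.land_use[background] -= area
                break
```

Restricting the search to the triangles of one polygon is what keeps the
inner loop short; the element's coordinates are discarded once its area has
been added to a counter, as in the text.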

As noted above, the spatial location of the land use elements is
available, but is not retained. It is possible, however, to maintain this
information in a special file, for later use in such situations as detailed
flood damage calculations, detailed siting studies, etc.

Thus, there are two basic mechanisms for handling variable resolution
data within the ADAPT data structure. One is the use of variable-sized
cells, in which each cell retains its homogeneity (e.g., each cell would
contain area of one and only one land use). The other approach is to take
advantage of the homogeneity wherever possible, but also to allow for
'inclusions' within the individual triangle. Both techniques have been fully
implemented and utilized in major planning efforts, and taken together, provide
what appears to be a reasonable set of tools for handling most of the kinds
of spatial data encountered in land-based planning.


DATA FLOW TO HEC PROGRAMS


The  ADAPT  system  is  organized  as  three  sets  of  interacting  components: 
the  data  base,  display  modules,  and  applications  modules.  Display  modules 
are  the  printer  and  pen  plotter  programs,  various  table,  lister,  and  report 
programs,  all  of  which  operate  as  essentially  general  purpose  programs  on 
any  ADAPT  data  base.  The  applications  modules  are  programs  or  sets  of  programs 
performing  a specific  design  or  planning  function.  Among  the  applications 
programs  are  modules  for  sewer  design,  land  application  analysis,  nonpoint 
source  analysis,  disaggregation  of  demographic  data  to  smaller  areal  units, 
etc.,  etc.  These  programs  similarly  interact  with  the  data  base,  and  may 
store  data  in  the  triangle  record  for  later  output  through  a display  program. 

In certain cases, display programs or applications modules are obtained from
other sources. For example, three-dimensional views of the terrain surface
of an ADAPT data base are produced using the SYMVU program of the Harvard
Laboratory for Computer Graphics. In such cases, rather than modifying
these outside source programs, an interface is used which extracts data from
the ADAPT data base and reformats it to the appropriate input format for the
program.

Within this framework, two possibilities for connecting the ADAPT data
base to the HEC programs exist, as noted previously. One would be to convert
programs HYDPAR, DAMCAL, and RIA to applications modules, and convert GRIDPLOT
to a display module (or utilize the existing printer plot capacity of the
ADAPT display modules, revised appropriately). The other would be to treat
the four programs as 'external' programs, and write interfaces, converting
triangle cell format data into regular grid cell format data. The approach
that is recommended is a mix of these two, in which HYDPAR, DAMCAL, and
RIA would be revised to work directly off of the ADAPT format data base, and
an interface to GRIDPLOT would be developed. The rationale and considerations
for this approach are set forth below.

The triangular cell structure utilized in ADAPT is similar to the regular
cell currently used by HEC in that data is referenced to the area. As noted
previously, each triangle record carries with it information as to the area
referenced, and the location of the area in space, as well as all of the
data attributes associated with that triangle. As such, the file can be
processed in a sequential manner, analogous to current HEC processing techniques,
the only exception being that data accumulations would have to be based on
the variable cell area. Thus, for program HYDPAR, the data flow shown in
Figure IV-2 of Reference 2 could be followed analogously. Each of the data
elements would be stored in the data record. The subbasin hydraulic
length, which is an exogenous input to the HEC procedure, could, in fact, be
calculated from the ADAPT terrain model, as could many other hydrologic
parameters related to basin morphology. The flow of information in DAMCAL, shown
in Figure V-2 of Reference 2, can similarly be handled in a completely
analogous fashion, but here some of the distinctions between the triangle and
the regular grid cell present themselves. In particular, elevation varies
continuously over a triangle cell. Thus, it is possible, given a reference
flood elevation at that cell, to calculate the actual flooded area for the
cell, and adjust the land use calculations accordingly. As noted earlier,
if a spatial data file of significant detailed land use points were to
exist, it is a simple matter to determine whether or not any of these points
lies within the flooded area of a given triangle, and incorporate this more
precise delineation of land use into the procedure. In addition, the
possibility exists of direct calculation of the flood elevations using the
terrain model of ADAPT within HEC-1/HEC-2.
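
The flooded-area calculation mentioned above can be illustrated with a short
sketch. It assumes that each triangle is a planar facet defined by its three
vertex elevations, and that land use is spread uniformly over the triangle
when no detailed point file is available; the function names are hypothetical
and this is not the DAMCAL or ADAPT code.

```python
def flooded_fraction(vertex_elevations, flood_elevation):
    """Fraction of a planar triangle's plan area lying below flood_elevation."""
    z0, z1, z2 = sorted(vertex_elevations)
    h = flood_elevation
    if h <= z0:
        return 0.0                      # entirely dry
    if h >= z2:
        return 1.0                      # entirely flooded
    if h <= z1:
        # Water covers a small similar triangle at the lowest vertex.
        return (h - z0) ** 2 / ((z1 - z0) * (z2 - z0))
    # Otherwise the dry part is a small triangle at the highest vertex.
    return 1.0 - (z2 - h) ** 2 / ((z2 - z1) * (z2 - z0))


def flooded_land_use(land_use_areas, vertex_elevations, flood_elevation):
    """Scale each land use area by the flooded fraction (uniform-mix assumption)."""
    frac = flooded_fraction(vertex_elevations, flood_elevation)
    return {category: area * frac for category, area in land_use_areas.items()}
```

For a triangle with vertex elevations 0, 10, and 20 and a flood elevation of
5, the water line cuts the two edges leaving the lowest vertex at one half and
one quarter of their lengths, so one eighth of the plan area is flooded.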

For the RIA program, it appears that processing is again on a cell by
cell basis, which presents no problem. Generation of coincidence matrices
is simply handled, but must be adjusted to be weighted by area for it to be
meaningful. Locational attractiveness on a cell by cell basis is again a
routine operation. The only potential difficulty is the assignment of a
'distance' factor to each cell, as appears to be done in the RIA example.
Distance to residential land use is noted as a cell attribute, used in the
locational attractiveness calculation, but this attribute is not cataloged
as belonging to the basic data bank. If this data has been input exogenously
to the data bank, then the ADAPT format can be handled analogously. The
calculation of inter-cell distances within ADAPT is not as straightforward
as for the regular cell, because of the variation in shape of the triangles,
and because any triangle may have a variable number of neighbors (unlike the
regular cell, which, if completely internal, will have eight neighbors).
Thus, RIA calculations based on inter-cell distances which would need to be
calculated internally within the data file would require some development
of an efficient inter-cell distance capacity within the ADAPT system.
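
The area weighting of a coincidence matrix can be sketched as below. The
record layout (a dictionary with an "area" key and one key per attribute) is
a hypothetical stand-in for a triangle record, not the RIA or ADAPT file
format.

```python
from collections import defaultdict


def coincidence_matrix(records, var_a, var_b):
    """Cross-tabulate two attributes, accumulating cell area, not cell count."""
    matrix = defaultdict(float)
    for rec in records:
        matrix[(rec[var_a], rec[var_b])] += rec["area"]
    return dict(matrix)


# Example: three triangles of unequal size.
cells = [
    {"area": 12.0, "land_use": "residential", "soil": "B"},
    {"area": 3.0, "land_use": "residential", "soil": "B"},
    {"area": 40.0, "land_use": "forest", "soil": "C"},
]
table = coincidence_matrix(cells, "land_use", "soil")
```

With regular cells a simple count per category pair is proportional to area;
with variable-sized triangles it is not, which is why the accumulation adds
each record's area.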

The GRIDPLOT program is not described in detail, but appears to be similar
in concept to SYMAP in that it produces shaded printer plots, apparently
of the entire data file, with certain statistics associated with each plot.
The GRIDPLOT program appears to be intimately tied to the cell size used in
the data bank. ADAPT has previously been interfaced successfully to SYMAP,
and a number of printer plot programs exist within the display module of
ADAPT. Accordingly, no insurmountable problems in creating an interface to
GRIDPLOT are foreseen. Further ramifications of this issue are explored in
the following section on computer graphics.

It  is  perhaps  appropriate  to  point  out  at  this  juncture  some  of  the 
essential  differences  between  the  triangular  cell  as  a cell  and  the  rectang- 
ular cell,  which  have  a bearing  upon  the  data  flow.  As  noted  earlier,  the 
existence  of  a triangle  itself  is  information-bearing  within  the  ADAPT 
system.  That  is,  a triangle  side  has  meaning,  and  triangle  vertices  and  sides 
are  points  of  relatively  high  information  content,  as  opposed  to  a set  of 
random  or  regularly  spaced  points.  The  study  area  to  be  incorporated  within 
a data  bank  can  be  seen  as  a continuum  or  surface  in  each  of  its  variables 
(in  the  real  world).  Short  of  explicitly  representing  this  surface,  any 
cell-based  representation  involves  some  smoothing,  and  hence  some  reduction 
in  the  high-frequency  variations.  With  a variable  resolution  system  and 
cells whose shape/location itself is information-bearing, the triangle cell
can  capture  more  of  the  high-frequency  variation  than  can  an  equivalent  number 
of  regular  cells.  If  data  stored  in  triangle  format  is  then  passed,  through  an 
interface,  to  a regular  grid,  some  smoothing  will  of  necessity  take  place, 
and  the  increased  information  content  associated  with  the  triangle  is  lost. 

If  a large  number  of  small  regular  grids  are  interpolated  from  the  triangle, 
again  the  value  of  the  triangle-based  system  in  limiting  the  number  of  discrete 
cells  that  must  be  dealt  with  is  lost.  The  loss  of  the  high-frequency 
information  in  this  smoothing  process  is  non-linear  in  its  impacts.  For 
example,  in  terrain  modeling,  the  high  slope  areas  will  generate  relatively 
more  nonpoint  source  pollutant;  development  costs  on  high  slope  areas  are 
significantly  higher.  Smoothing  of  this  terrain  through  either  regular 
grids  or  'large'  triangles  in  effect  filters  out  these  high  frequencies 
and  prevents  their  being  carried  forward  in  the  analysis.  Depending  upon 
the  process  being  analyzed,  this  effect  can  be  important. 

A further  feature  of  the  triangle  cell  as  opposed  to  the  rectangular 
cell  is  that  it  is  meaningful  as  a finite  element  for  such  processes  as  rain- 
fall-runoff generation.  A regular  grid  cell  is  not  unambiguously  defined  as 
a terrain  element,  and  flow  patterns  from  adjoining  cells  are  not  defined 
as  they  are  between  triangular  elements.  This  'finite  element'  behavior  of 
the  triangles  for  hydrologic  processes  allows  for  detailed  flow  routing  over 
each  element,  as  opposed  to  summarizing  all  of  the  element  characteristics 
for  a basin  and  performing  an  'average'  routing,  a procedure  which  again 
eliminates  the  important  high-frequency  variation  which  can  be  attributed 
to  small  impervious  surfaces,  high  slopes,  etc.  In  the  context  of  dynamic 
analyses,  it  is  meaningful  to  talk  about  mass  balances  on  a triangle, 
accumulation,  deposition,  depth  of  flow,  and  other  items  of  physical  signif- 
icance. Thus,  the  triangular  cell  becomes  more  than  just  a repository  for 
data,  it  becomes  an  analog  of  a real-world  terrain  element.  This  capacity 
of  the  approach  has  been  explored  in  only  limited  fashion  to  date,  primarily 
in  the  arena  of  modeling  of  rainfall-runoff-nonpoint  source  generation,  but 
it  suggests  the  possibility  of  more  physically-based,  descriptive  models 
of  these  processes  than  those  presently  available. 




COMPUTER GRAPHICS


An ADAPT data base is not structured with regard to any particular
output format. Data is stored in geodetic coordinates, not in relation to
a printer character. As such, the ADAPT data base is essentially
'device-independent'. Because so much of the graphic data of interest to planners
is  mapped  line  data,  the  initial  implementation  of  display  modules  for  ADAPT 
centered  upon  the  use  of  the  pen  plotter  as  an  output  device  for  graphical 
output.  Capabilities  of  the  pen  plotter  graphics  display  modules  within 
ADAPT  include  polygon  plots,  contour  plots,  scatter  plots,  drainage  network 
plots,  base  triangle  map  plots,  all  at  continuously  variable  scale. 
Alternatively,  plots  can  be  auto-scaled  to  fit  within  a desired  window,  or 
both  a scale  and  window  can  be  specified.  The  vector  plotting  approach  can 
also  be  used  with  Tektronix  terminals  for  remote  graphical  access  to  the 
data  base.  This  line  plotting  capability  was  recently  augmented  by  a printer 
plot  capability.  The  printer  plot  capability  was  introduced  to  the  ADAPT 
system  to  provide  both  rapid  turnaround/low  cost  graphical  displays  of  the 
data  base,  for  checking  and  orientation  purposes,  and  to  provide  some  graphical 
capacity  from  remote  teletype  terminals.  Because  the  ADAPT  system  uses 
geodetic  coordinates,  these  printer  plot  programs  required  the  development 
of  an  interface  to  the  grid  matrix  of  printer  plot  positions.  One  approach 
uses  the  centroid  of  the  triangle  as  a locator,  and  the  printer  plot  position 
for the particular character associated with the data element to be plotted
is  located  at  the  centroid  of  the  triangle.  That  is,  only  a single  character 
is  produced  for  each  triangle,  independent  of  the  size  of  the  triangle,  and 
that  character  is  at  the  triangle  centroid.  An  alternate  approach,  also 
implemented,  actually  plots  triangle  sides,  so  that  the  explicit  boundaries 
are  retained.  It  is  a simple  matter  to  modify  the  plot  programs  to  print 
multiple  points  for  each  triangle,  so  that  larger  triangles  will  fill  more 
of the spaces in the printer plot grid.
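
The centroid approach to the printer plot interface can be sketched as
follows. This is an illustrative Python fragment under stated assumptions: the
function names, the window tuple, and the rule that a later triangle simply
replaces an earlier character at the same position are choices made for the
example, not a description of the ADAPT programs.

```python
def centroid(triangle):
    """triangle: three (x, y) vertex pairs in map coordinates."""
    xs, ys = zip(*triangle)
    return sum(xs) / 3.0, sum(ys) / 3.0


def printer_plot(triangles, symbols, window, n_cols, n_rows):
    """Place one character per triangle at its centroid.

    window: (xmin, ymin, xmax, ymax) in map coordinates.
    Returns the plot as a list of strings, top row first.
    """
    xmin, ymin, xmax, ymax = window
    grid = [[" "] * n_cols for _ in range(n_rows)]
    for tri, sym in zip(triangles, symbols):
        cx, cy = centroid(tri)
        if not (xmin <= cx <= xmax and ymin <= cy <= ymax):
            continue                    # centroid outside the plot window
        col = min(int((cx - xmin) / (xmax - xmin) * n_cols), n_cols - 1)
        # Printer rows run downward, so northing is inverted.
        row = min(int((ymax - cy) / (ymax - ymin) * n_rows), n_rows - 1)
        grid[row][col] = sym            # a later triangle replaces an earlier one
    return ["".join(r) for r in grid]
```

The window and the numbers of rows and columns together fix the plot scale;
unequal character height and width on a real printer would need separate
horizontal and vertical scale factors.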

Printer  plot  programs  currently  available  allow  for  variable  scaling, 
variable  line  spacing  on  the  printer,  and  selection  of  arbitrary  windows. 
Symbology  is  currently  'hard-wired'  in  at  61  characters,  and  there  is  no 
current  provision  for  overprinting;  these  two  restrictions  can  of  course  be 
relaxed  at  an  increase  in  processing  cost — programs  are  currently  set-up 
to  be  low  cost,  'quick-look',  rather  than  final  product. 

Perhaps  the  major  problem  with  printer  plots  interfaced  to  the  triangle 
network  is  simply  that  of  resolution  at  a given  scale.  Due  to  the  nature 
of the triangle cells, many individual triangles may fall at a single plot
position,  particularly  when  map  scales  are  large.  Short  of  defining  a special 
symbology  for  overprinting,  or  generating  maps  at  larger  scale  and  then 
photographically  reducing  them,  there  is  little  adequate  resolution  of  this 
problem. 

In terms of interfacing with a program such as GRIDPLOT, the logical
data flow would require that each triangle be either sub-divided into grid
cells, or smaller triangles be aggregated up into larger grid cells. The
problem of overlaying an arbitrary regular grid cell on the triangle pattern
has been solved for other purposes, and could be utilized here to define an
'average' situation for each grid cell to be plotted by printer plot. From
this orientation, the interface would work in the direction from the grid
cell to the triangle network; that is, given a grid cell, all of the triangles
falling within that cell would be determined, and the appropriate 'average'
area or characteristic determined. The next grid cell would then be processed.
This is distinct from an approach which would process the triangles sequentially,
partitioning them into grid cells or aggregating up as necessary. That
approach could also be implemented, following upon the pattern of the existing
ADAPT printer plot programs, with the modification that data could be
accumulated in a single plot cell until the entire file has been processed,
at which time the necessary symbology can be applied to the cumulated data
in the cell.
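
The sequential alternative described above (accumulate per plot cell, then
apply the symbology once the file has been read) might look like the
following sketch. It assigns each triangle to the plot cell containing its
centroid and takes an area-weighted mean, which is only one possible
definition of the 'average'; large triangles are not partitioned here. All
names and the record layout are assumptions made for illustration.

```python
from bisect import bisect_right


def accumulate_cells(records, window, n_cols, n_rows):
    """records: (cx, cy, area, value) per triangle, read sequentially.
    Returns the area-weighted mean value per plot cell (None where empty)."""
    xmin, ymin, xmax, ymax = window
    total = [[0.0] * n_cols for _ in range(n_rows)]
    weight = [[0.0] * n_cols for _ in range(n_rows)]
    for cx, cy, area, value in records:
        if not (xmin <= cx <= xmax and ymin <= cy <= ymax):
            continue
        col = min(int((cx - xmin) / (xmax - xmin) * n_cols), n_cols - 1)
        row = min(int((ymax - cy) / (ymax - ymin) * n_rows), n_rows - 1)
        total[row][col] += area * value
        weight[row][col] += area
    return [
        [total[r][c] / weight[r][c] if weight[r][c] > 0 else None
         for c in range(n_cols)]
        for r in range(n_rows)
    ]


def to_symbols(means, breaks, symbols):
    """Apply a range symbology; len(symbols) must be len(breaks) + 1."""
    return [
        "".join(" " if v is None else symbols[bisect_right(breaks, v)]
                for v in row)
        for row in means
    ]
```

Deferring the symbol choice until all triangles have been read is what lets
several small triangles share one print position without the last one read
determining the character.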

One  additional  printer  plot  capacity  not  mentioned  above  is  the  opportun- 
ity to  display  numerical  values  associated  with  a given  triangle,  at  the 
centroid  of  the  triangle.  Thus,  a plot  of  the  actual  numeric  centroid  eleva- 
tions, or  soil  codes,  or  land  use  codes,  as  opposed  to  a range  plot  represented 
by  a symbology,  also  exists.  This  capability  is  implemented  with  a subroutine 
which  decodes  a given  number  in  the  triangle  record  into  separate  integers, 
and  then  determines  the  appropriate  plot  position  for  each  integer.  This 
approach  is  feasible  because  the  printer  plot  routines  of  ADAPT  set  up  a 
matrix  in  core  in  which  each  element  is  a print  position  which  is  filled 
at the appropriate plot position as each triangle is accessed. Within this
framework,  it  is  possible  to  interrogate  the  matrix  as  to  its  current  value 
before  replacing  that  value  with  another  character,  thus  insuring  that 
overwriting  will  only  take  place  when  no  significant  information  will  be  lost. 


ADVANTAGES  AND  DISADVANTAGES  OF  PROPOSED  TECHNIQUE 

It is difficult not to appear biased in discussing the advantages and
disadvantages of the adoption of the ADAPT system by HEC, in view of the
obvious vested interest of W. E. Gates and Associates. WEG/A is firmly of
the belief that ADAPT represents state-of-the-art technology for manipulation
of land-based data in conjunction with planning and design process
models, and has in fact committed much of the very existence of the company
to that proposition. With this declaration of bias, and under the assumption
that the proposed solution would take the form of re-orientation of
the existing HEC programs to work entirely off of an ADAPT data base as
opposed to a regular grid cell data base, the following advantages and
disadvantages (from the point of view of HEC) may be identified:

disadvantages

. need to acquire ADAPT software

. need to revise existing programs to conform to ADAPT data base
system

. need to learn new system

. possible increase in system overhead associated with storing
ADAPT topological/locational information

. desirability of pen plotter and digitizer availability in
conjunction with ADAPT may require additional hardware resources
beyond those presently available

. additional decision-making and data evaluation necessary with
variable resolution storage capacity




advantages

. system provides digital terrain model, variable resolution
capacity, meaningful finite elements, polygon manipulation
for both data input and retrieval, geodetic coordinates

. system has proven capability to manipulate spatial data for
large areas

. cell orientation means that existing HEC programs working off
regular grid should be modifiable without excessive difficulty

. system provides for direct access programming, leading to
possibilities for interactive and interactive graphic manipulation
of spatial data for large areas

. variable resolution capacity cuts redundant data storage
significantly

. existing applications/display modules of ADAPT may be of interest
within HEC (nonpoint source analysis, various hydrologic
estimation programs, sewer design, pen plotter routines, etc.)

. capacity to manipulate embedded drainage networks should be of
value to other HEC efforts

. ADAPT system is ongoing, evolving system, with additional
modules/capacities being added by WEG/A in many areas of potential
parallel interest to that of HEC; thus, opportunity for HEC to
obtain additional capacity for system without incurring
development costs

. modular nature of system and system utilities allow for easy
implementation of new applications/display modules


CONCLUSION 

HEC has invested considerable effort in automated spatial analysis
techniques to date, with what appear to the outsider to be at least two
significant benefits: the recognition of the appropriateness and utility of
coupling models and spatial data bases, and the introduction of spatial data
base technology as a valid and usable technique for studies within the Corps.
Having made these strides, the next levels of consideration relate to those
factors associated with usability, flexibility, efficiency, and cost of
such systems. As those of us who have worked with spatial data are well
aware, it is almost always easy to do something in concept; doing it in
reality, with tremendous pain, is another matter entirely. The effort
associated with producing the Trail Creek pilot study, though not indicated
in the report, must have been considerable. The payoff is the apparent
acceptance of the approach within the Corps.

We know from our own experience that such acceptance is hard-bought.
Initial implementation of the ADAPT system was a massive effort, involving
much hand work. Once we had demonstrated the feasibility and utility of the
concept, efforts over the next period of years were devoted towards making
it, if not easy, at least reasonable in terms of effort. A parallel
recognition was that it is all too easy to devote most of the project
effort to 'managing' the data, and too little time to using the capacities
of the models/spatial data base to gain insight into the processes under
consideration. Accordingly, over the years, the emphasis has shifted more
and more towards the position of getting to the point of having a data base
to work with, in a reasonable time frame and with reasonable expenditure of
effort, and then devoting the majority of time and effort to actually working
with the data base, sorting and sifting, testing alternatives, etc.

HEC has obviously recognized some shortcomings of its existing approach
in terms of being able to deal with large areas with little pain. The
variable resolution solution of ADAPT, proposed herein, can go a long way
towards lessening the pain, shortening the time necessary to get the data
base on line, and allowing more real planning/engineering work to be carried
out. In our pursuit of a system with these capacities, we have taken a
very pragmatic approach: "if it works, use it." Thus, the ADAPT system
has incorporated both cell and polygon approaches, without entering into
interminable arguments. Digitizers and pen plotters are utilized because
they are cost-effective. Printer plots are included because they are also
valuable. Mini-computers and interactive graphics are being explored with
a view to making the system more efficient, more usable. The point of all
of this is not to continue to praise the merits of the ADAPT system; rather,
it is to suggest that there is really no single best solution within the
field of spatial analysis. The appropriate mix of tools will always contain
a number of ways of doing things, and this mix will change over time,
depending upon the problem at hand, the cost and availability of hardware,
and the continuing evolution of the tools themselves. A pragmatic approach
to achieving this mix, taking advantage of the availability of new
hardware/software, appears to be the most appropriate, certainly for those
with an engineering orientation, where the solution to the problem, rather
than the imposition of a technology, is the goal.


REFERENCES

1. W. E. Gates and Associates, "ADAPT Handbook", Fairfax, Virginia, 1976.

2. Davis, D. W., et al., "Phase I Oconee Basin Pilot Study - Trail Creek Test",
Hydrologic Engineering Center, U. S. Army Corps of Engineers, Davis,
California, September, 1975.


Q. Do all overlays, for various geographic parameters, need to be
made  on  the  same  Basemap? 


A.  With  the  exception  of  land  use,  yes,  all  parameters  are  overlaid 
on  the  same  Basemap.  This  is  accomplished  by  direct  scaling  or 
photographically  enlarging  or  reducing  the  working  maps  to  the 
size  of  the  Basemap.  It  is  also  possible  to  rescale  the  triangular 
grid plot to the scale of the working maps. Land use data inserted
in our 'circle' format goes in from whatever scale source map is
available - we produce a Calcomp triangle map at that scale and
utilize the combination in digitizing.


Q.  Does  it  require  a specially  trained  person  to  determine  the  tri- 
angular network  and  do  the  encoding? 

A.  Yes,  it  does  require  specially  trained  personnel  to  lay  out  the 

triangular  cell  network.  The  triangles  themselves  contain  signifi- 
cant geographic  information  by  the  very  nature  of  their  size  and 
location,  i.e.  triangle  boundaries  are  established  from  a detailed 
understanding  of  the  terrain,  soil,  land  use,  etc.  The  usual  rec- 
tangular grid  cells  do  not  require  this  additional  complexity  because 
the  cells  are  only  storage  locations  and  the  nature  of  the  cell  it- 
self does  not  contain  any  information.  Once  the  triangular  network 
has  been  established,  then  it  is  a straight  forward  encoding  job  to 
digitize  the  X,  Y,  Z data  for  the  vertices  of  the  triangles. 


Q.  Why  don't  you  capture  all  the  geographic  detail  of  your  triangle 
when  you  are  doing  the  terrain  assignment? 

A.  For  ease  of  encoding,  only  the  triangle  vertices  are  encoded  during 
the  first  pass.  Software  is  then  used  to  set  up  the  triangular 
grid  storage  system  and  additional  geographic  variables  are  added 
in  successive  passes. 




Q.  Should  you  start  with  terrain  as  the  basis  for  your  triangulation 
and  then  subdivide  those  triangles  as  necessary  for  other  geographic 
variables? 

A.  Yes,  good  terrain  representation  is  the  foundation  of  our  data  bank. 


Q.  How  many  times  is  each  vertex  digitized? 

A.  As  many  times  as  there  are  triangles  in  common  with  it.  Software  is 
used  to  assign  numbers  to  each  vertex  in  subsequent  processing  of  the 
digitized  data. 


Q.  Are  there  any  automatic  checks  on  triangle  encoding? 

A.  No,  but  because  it  is  a consistent  set  of  information,  the  missing 
triangles  really  stand  out. 


Q.  How  do  you  number  the  triangles  and  what  information  do  you  store 
for  each  triangle? 

A.  The  triangles  are  numbered  sequentially  using  the  digitizer  during 
the encoding process. The X, Y coordinate location of each vertex
is  determined  by  the  digitizer  and  the  elevation  is  keyed  in  by  the 
operator.  The  identification  of  neighboring  triangles  is  calculated 
during  file  establishment  processing  and  then  retained  in  the  file. 
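
The two file establishment steps mentioned in these answers (numbering
vertices that were digitized once per adjoining triangle, and identifying
neighboring triangles) can be sketched as below. This is a hedged
illustration only: the function name, the coordinate tolerance used to merge
repeated digitizer readings, and the rule that neighbors are triangles sharing
a full side are assumptions, not the documented ADAPT procedure.

```python
def build_topology(triangles, tol=1e-6):
    """triangles: list of three (x, y, z) vertices exactly as digitized,
    so a shared vertex appears once for every triangle that uses it."""
    vertex_ids = {}          # snapped (x, y) key -> vertex number
    vertices = []            # vertex number -> (x, y, z)
    tri_vertex_ids = []
    for tri in triangles:
        ids = []
        for x, y, z in tri:
            # Merge repeated readings of the same point; readings that
            # straddle a rounding boundary would need a more careful match.
            key = (round(x / tol), round(y / tol))
            if key not in vertex_ids:
                vertex_ids[key] = len(vertices)
                vertices.append((x, y, z))
            ids.append(vertex_ids[key])
        tri_vertex_ids.append(tuple(ids))

    # Two triangles are neighbors when they share a side (a vertex pair).
    side_owners = {}
    for t, ids in enumerate(tri_vertex_ids):
        for i in range(3):
            side = frozenset((ids[i], ids[(i + 1) % 3]))
            side_owners.setdefault(side, []).append(t)
    neighbors = [[] for _ in triangles]
    for owners in side_owners.values():
        if len(owners) == 2:
            a, b = owners
            neighbors[a].append(b)
            neighbors[b].append(a)
    return vertices, tri_vertex_ids, neighbors
```

A side owned by only one triangle lies on the boundary of the network, which
also offers a simple way to make missing triangles stand out.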


Q.  Have  you  used  any  statistical  sampling  techniques  to  determine  the 
value  of  cell  parameters  for  different  geographic  characteristics? 

A.  No,  usually  the  dominant  value  is  encoded.  Remember,  though,  that 

triangle  boundaries  are  originally  established  to  identify  differences 
in  geographic  characteristics. 


Q.  What  do  you  do  when  you  want  to  add  another  polygon  (geographic  data 
variable)  to  an  existing  triangle  network? 

A. The triangle network was originally established to represent the
variation in all different geographic variables of interest. If a
new variable is to be established, the easiest method would be to
use the polygon shapes already represented by the existing triangles.
Additional triangles could be added to the existing network, and this
would involve an updating of the data bank.


Q.  Is  the  triangular  network  particular  to  a special  purpose? 


A.  Yes,  it  is  not  a generalized  geographic  data  bank.  The  triangles 
themselves  provide  information  by  their  size  and  shape.  The  em- 
phasis is  on  good  terrain  representation. 


Q.  How  can  you  add  more  detailed  land  use  information  to  an  existing 
triangular  network? 

A. We have used a percentage composition approach to capture more
detailed land use information within a triangle, e.g. 10% industrial,
30% commercial, 20% residential, and 40% developed open space.
Because land use is usually so irregularly distributed, the percentage
composition  approach  has  allowed  us  to  represent  this  data  and  still 
maintain  our  basic  triangular  network.  Another  solution  is  to  add 
more  triangles  as  previously  noted. 


Q. Have you used the land use, soil type, surface slope, etc.,
characteristics of the triangles in direct computation of rainfall-runoff from
each cell and then routed that water downstream from triangle to
triangle according to the topography?

A. Yes, we have simulated the rainfall-runoff process using the kinematic
wave  technique  on  a triangle  by  triangle  basis.  The  geographic  data 
in  the  triangles  has  also  been  used  in  an  aggregate  fashion  to  com- 
pute unit  graph  characteristics  of  subbasins.  We  have  found  the  unit 
graph  parameters  to  be  quite  sensitive  to  the  nonlinearities  associ- 


VARIABLE  GRID  RESOLUTION  - ISSUES  AND  REQUIREMENTS: 
THE  ADAPT  SOLUTION 

AUTHOR'S  ADDITIONAL  COMMENTS 


I have no major revisions to make to my paper at this time. I would
like to re-emphasize certain of the attributes of triangles that
appear to have become somewhat cloudy in the discussion of the past
few days. Chief among these is that the reduction in the number of
cells achieved by using the triangles provides for direct access
programming techniques. That is, each cell record is immediately
available to a processing program. This allows for such basic
processes as profiling (finding the terrain and attribute path between
two points), automated drainage calculation (through investigation
of the properties of adjacent triangles), and polygon retrieval
(specification of a chain, and retrieval of only the cells lying
within that chain). We anticipate that this approach will further
lend itself to interactive on-line modeling, and use of minicomputer
based modeling/GIS technology. As modelers, we have found a number
of areas in which the direct access capacity significantly enhances
performance, operating mode, etc. In essence, it is no longer
necessary to take a drink of water from a firehose - rather, the data
base can be investigated by parts, in an almost browsing manner, which
is particularly valuable for calibration of our models. Thus, the
variable resolution capacity significantly reduces the number of cells,
making an alternative storage methodology feasible, and providing all
the attendant advantages. It seems that I did not adequately stress
this aspect in either my paper or presentation.

In terms of apparent drawbacks, I think that most are logistical and
perhaps institutional, as opposed to technical. As I have noted, we
tend to rely upon a digitizer/plotter. I think that these are valuable
and cost-effective general purpose tools, particularly for a shop such
as HEC, but I do recognize the situation where one would desire to
operate with hand-encoded data and a printer plot capacity. Inasmuch
as the digitizer processing of ADAPT is an upgrade of our formerly
hand-encoded procedures, I see a fairly straightforward resolution
to the encoding problem. Display, on the other hand, is still desirable
through a plotter. I can see the possibility of enhancing currently
existing printer plot software to make it more usable in a "no plotter"
environment, but I am not sure that the quality of the product would
be visually acceptable, although certainly technically adequate. It




is  more  an  issue  of  form  than  substance  here.  As  people  have  noted 
during  the  seminar,  construction  of  a triangle  grid  requires  some 
skill  and  training.  The  production  people  of  WEG/A  are  currently 
attempting  to  produce  a manual  of  procedures  and  training  program 
for  our  own  in-house  purposes  (to  be  able  to  hire  lower  cost,  lower 
skilled  technicians)  - again,  quality  control  procedures  are  being 
developed  for  this  same  purpose. 

Keep  your  options  open! 


VARIABLE  GRID  RESOLUTION 
HYDROLOGIC  ENGINEERING  CENTER 


By 

Jack Dangermond¹
Raymond Postma²
William Hodson, Ph.D.³


I.  Introduction 

The Hydrologic Engineering Center (HEC) has requested suggestions
for increasing the efficiency of their automated geographic data system.
Using grid cells of variable size for areas of differing resolution has
been suggested as one possible solution to the problem of data
management and analysis. This method is possible and workable; however,
maintaining a multivariable data file of variable grid resolution would
require major changes in HEC processing procedures and software.

The approach taken in this paper is to describe new ideas within
the context of existing HEC software procedures, and to concentrate on
the design of new file structures which can be easily interfaced with
existing programs. With these policies in mind, we recommend the
following:

1.  Keep  the  existing  software,  with  minor  modifications. 

2. Use data storage techniques capable of interfacing with existing
programs.

3.  Increase  efficiency  by  using  data  compression  techniques  which 
approximate  variable-size  grid  cells  but  appear  to  the  software 
as  a uniform  grid. 

4.  Consider  the  possibility  of  doing  several  partial  studies  at 
different  grid  cell  sizes. 

This paper describes the background for these recommendations, and
identifies several alternatives to each of these policies.


¹Director, Environmental Systems Research Institute, Redlands, California.

²Programmer Analyst, ESRI, Redlands, California.

³Environmental Scientist, ESRI, Redlands, California.




II. Data Automation and Storage*


The  area  where  it  was  felt  that  recommendations  could  best  be 
directed  dealt  with  the  storage  of  map  data.  In  order  to  best  understand 
the  data  storage  component  a brief  description  of  automation  currently 
practical  for  HEC  use  is  provided. 

A.  Data  Automation 

The conversion of map data into computer readable form is normally
done by one of two techniques: manual recording of cell data,
or use of an electronic digitizer measuring x,y coordinates.

1.  Cell  Encoding 

The practicality and cost effectiveness of cell encoding is
a subject of no little concern. The parameters for selecting
a cell size and technique for encoding are poorly researched; as
data is recorded on these, statistics will provide meaningful
input in the design of data structures. We do know certain
information. For example, when manually coding continuous data,
such as elevations, the cell by cell recording technique appears
to be more efficient and accurate than recording continuous
lines and interpolating into a cell structure. However, if very
accurate values are not required on a large portion of the study
area, a more dispersed sampling could reduce the coding effort.

If the sampling were done on an equal interval basis, for example
every other column on every other row, the computer processing
would be reduced. A surface mining operation in Australia
using ESRI software achieved very acceptable results with this
approach. The sampling technique in this example would reduce
the coding effort by a factor of four in those areas where this
resolution is adequate. By doing every fourth column on every
fourth row the coding effort would be reduced to one-sixteenth
of the effort of doing every cell.

*There are two general categories of technology associated with
automated map recording, analysis and display which have been the
principal focus of ESRI software development project work and
information systems. These are grid cell and x,y coordinates. These
two categories have unique automation techniques, analysis procedures,
output types, etc., and although separate also have linkages which
allow them to be interfaced as hybrid systems for special applications.

In an ideal environment (i.e., experienced people, computer hardware,
time, etc.) an analyst might use a combination of these tools which would
be extremely sophisticated. Appendix A describes a variety of available
software and how it can be linked together to create an integrated
system. However, the resources necessary for taking advantage of these
tools may not (at least in the short run) be available or practical to
the users of the HEC system.

The point is that the mathematics of surface sampling and
rough abstractions are a major consideration.
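
The reduction factors quoted for equal interval sampling follow from
simply counting the cells that are visited. The short sketch below
illustrates this; the function names and the 400 by 400 grid are
hypothetical and are not taken from any ESRI or HEC program.

```python
def sampled_cells(n_rows, n_cols, step):
    """Return the (row, col) pairs coded when every `step`-th row and column is used."""
    return [(r, c) for r in range(0, n_rows, step) for c in range(0, n_cols, step)]


def coding_effort_fraction(n_rows, n_cols, step):
    """Fraction of the full cell-by-cell coding effort needed at the given step."""
    return len(sampled_cells(n_rows, n_cols, step)) / (n_rows * n_cols)


if __name__ == "__main__":
    # Hypothetical 400 x 400 grid, chosen so that it divides evenly by 2 and 4.
    for step in (1, 2, 4):
        print(step, coding_effort_fraction(400, 400, step))
```

With a step of two the fraction is one quarter, and with a step of four
it is one-sixteenth, which matches the figures given in the text.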

Also, when manually coding polygon, line and point data our
experience has been that some type of data compression is usually
more efficient than coding every cell. ESRI has implemented a
two-field per record, run-encoded technique which uses the value
of a variable and the rightmost column for which that value applies.
This technique is based on one documented by Dr. Kenneth Dueker,
which has been modified and includes special control records.
This has greatly reduced the manual coding effort. By including
a feature to duplicate all or part of the previous row, the coding
effort becomes quite small for what conceptually approaches a
variable grid cell. The actual grid cells are the same size,
but contiguous identical cells are grouped and coded together.

By using these coding techniques, it does not seem necessary to
develop some type of variable grid cell technique in order to
increase coding efficiency.
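
The two-field run-encoded record described above can be sketched as
follows. This is only an illustration of the idea under stated
assumptions: columns are numbered from 1, the "DUP" marker is a
hypothetical stand-in for the special control records (whose actual
layout is not given here), and only whole-row duplication is shown,
not the partial-row case.

```python
def encode_row(values):
    """Run-encode one row as (value, rightmost_column) pairs; columns are 1-based."""
    records = []
    for col, value in enumerate(values, start=1):
        if records and records[-1][0] == value:
            records[-1] = (value, col)      # extend the current run
        else:
            records.append((value, col))    # start a new run
    return records


def decode_row(records):
    """Expand (value, rightmost_column) pairs back into a full row."""
    row, left = [], 1
    for value, right in records:
        row.extend([value] * (right - left + 1))
        left = right + 1
    return row


DUPLICATE = "DUP"  # hypothetical control record: "same as the previous row"


def encode_grid(grid):
    """Encode each row, replacing a row identical to its predecessor by DUPLICATE."""
    encoded, previous = [], None
    for row in grid:
        records = encode_row(row)
        encoded.append(DUPLICATE if records == previous else records)
        previous = records
    return encoded
```

For example, the row 3, 3, 3, 7, 7, 2 is stored as three records,
(3, 3), (7, 5) and (2, 6), instead of six cell values.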

2.  X,Y  Coordinate 

Finally, the coding process is not affected by the final grid
cell size if, instead of manual coding, a digitizer is used to
code the areal, lineal and point data. This alternative requires
a shift to using x,y coordinate recording rather than grid coding,
and allows the creation of several grid data banks of differing
size. Software to efficiently create and analyze data in an x,y
coordinate form has been developed and has been in use for several
years (see Appendix A). Since HEC has traditionally used grid
files in the analysis and display portions of its system, this
paper considers only grid systems. ESRI normally maintains
separate grid data files created from digitized data, due to the
effort and expense of converting to grid form for each analysis.

B.  Data  Storage 


In a non-compressed format the amount of data storage increases
relative to the inverse square of the length of a grid cell side.
By making cells 200 feet by 250 feet (1.14 acres) instead of 800 feet
by 1000 feet (18.2 acres), sixteen times as much storage is required.
On many computer systems the amount of data storage required for the
spatial data file becomes the most important factor. This is due to
two factors. The first is the cost of storage, especially on a data
file that is used over a long term where storage cost may exceed
processing cost. The second factor is the access time often required
for use of large data files. Many systems require large data files
to be off-loaded to tape, and these files need to be reloaded to
on-line storage before analysis can occur. This transfer may not
occur for several hours after a request is placed for use of a file.
Also, the transfer cost may be considerable (as high as $50 to $100
on some installations). Finding ways to store data in a compact
form is therefore an important requirement for increasing the
efficiency of a system.

Grid data can be stored either as a separate file for each
variable or as one multivariable file. For single variable files,
a file structure similar to the coding layout usually requires the
minimum amount of space. For continuous or highly heterogeneous data,
the standard run-encoded data structure can require twice as much
data storage as an uncompressed form, because column location must be
included for each value. There are two ways to handle this problem.
The first is to place these data in an additional file structure
which is not run-encoded. The second alternative is to add a feature
which allows switching modes between run-encoded and uncompressed
data stored in the same file. When in the uncompressed mode, the
column field would be used to store data. This mixed data type scheme
appears to be feasible if additional logic is added in the I/O
routines to enable and disable the compression mode.

ESRI software normally converts character data to binary form
for faster processing. However, on some computers, such as the system
at Lawrence Berkeley Labs (LBL), the storage requirement for the binary
file is up to five times greater than the character file. In this
case, the trade off between increased processing efficiency of the
binary file and the increased data storage cost must be evaluated to
find the most cost-effective approach. "Bit packing" of the binary
data for the LBL system is a third alternative, but the cost of
packing and unpacking the data may prove to be as costly as using
a character data structure.

ESRI has purposely developed its software in a modular form so
that the external data structure is nearly transparent to the analysis
part of the programs. A simple replacement of the input/output
subroutine library allows the existing programs to use a different
data structure. One other advantage of the modular approach is
that groups or blocks of data can be accessed by the user program and
separated for analysis on systems where this process is not available
or is done inefficiently. This increases the efficiency of
transferring data from storage to analysis.

If only a few variables are analyzed at the same time, it should
be possible to retain them as single variable files. Leaving the
data as separate files for analysis offers several advantages. The
first is that no additional effort is required to convert the data from
the coding process into a multivariable form. The second advantage
is that the storage space required by several separate files is
normally considerably less and never greater than any composite
multivariable file that could be formed. The reason composite files
usually require more space is that a third dimension (variable
position) must be included in the multivariable file structure.
The third advantage is that only the variable(s) needed for a given
analysis need be on-line. It is often possible to retain a few
single variable files on-line for several days in situations where
the composite file is so large that it must be stored off-line on
tape. Therefore, when using few variables, it is most efficient
to use single variable files.

However, there are situations where it is not possible or
efficient to work with separate single variable files. This occurs,
for example, when the number of files needed for a given analysis
exceeds the capacity of the computer system to process them. We
have found that on most systems five files is a reasonable maximum
before conversion to a composite multivariable file is required.
This number is not an absolute limit. We have worked with small
systems which could not accept more than four files, and other
systems on which as many as fifteen files have been used. It is
sometimes difficult for the user to reference more than one or two
files in a given analysis, resulting in multivariable files being
created unnecessarily. If there are only a few files that need to be
used for a given analysis, attention should be given to simplifying
user access to the required files, rather than creating a multivariable
file with greater storage requirements.

If a multivariable file is required, more efficient techniques
for its storage are possible than those which are normally used. The
standard multivariable file has the row number in the first field of
each record, the column number in the second field, and then the data
values in the remaining fields. While some savings could be made
by dropping the row and column fields, the resultant file is much
less flexible. The simplest way to decrease storage requirements on
this file is to add a third field which would give the final column
for which all the data items apply. This allows one stored record
to account for all sequential cells which contain the same data. A
new multivariable record is stored whenever a variable changes in
value. This technique appears well suited to situations where all
the data variables are forced to be homogeneous over a sequence of
several cells, such as is the case in a variable grid cell type
approach. If this technique is used, all continuously changing
variables and all highly heterogeneous variables must be excluded
and remain as separate single variable files or no compression
can occur.
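
A minimal sketch of the "final column" multivariable record follows.
The record layout (row, first column, final column, then the data
values) is taken from the description above; the function name and the
use of Python tuples are illustrative assumptions only.

```python
def encode_multivariable_row(row_number, cells):
    """Encode one row of multivariable cells.

    `cells` holds one tuple of variable values per column (columns 1-based).
    Each output record is (row, first_column, final_column, *values); a new
    record is started whenever any variable changes in value.
    """
    records = []
    for col, values in enumerate(cells, start=1):
        if records and records[-1][3:] == tuple(values):
            row, first, _last = records[-1][:3]
            records[-1] = (row, first, col) + tuple(values)   # extend the run
        else:
            records.append((row_number, col, col) + tuple(values))
    return records
```

If any one of the variables in `cells` changes with every cell, as a
continuous variable would, every cell produces its own record and no
compression occurs, which is the reason such variables are best kept
in separate files.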

Another file structure which appears to offer more flexibility
and has potential for more storage efficiency uses a technique which
considers each variable independently and only creates a record if
a change from the previous cell occurs for that variable. Each
data record has two fields rather than a complete multivariable
entry. The first field indicates the variable position on a standard
multivariable file and the second field has the new value for that
variable. This value is used until another change occurs for that
position. If any variable changes between cells, a spatial identification
record precedes the change, which indicates the column position
(normally 2) and the column number where the data change occurs.
This data structure can be used on any mix of heterogeneous and
homogeneous data, but it is still best to leave continuous data as
separate non-compressed files if possible, because it changes with
each cell, thus requiring a new record for each cell.
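
The position/value structure can be sketched as below. Positions 1 and
2 are taken to be the row and column fields of a standard multivariable
record, so the data variables start at position 3. Two points are
assumptions made for this illustration and are not stated in the text:
every row begins with a row record, and values are not carried over
from one row to the next.

```python
ROW_POS, COL_POS = 1, 2   # positions of row and column in a standard record


def encode_changes(grid):
    """Encode a grid of multivariable cells as (position, value) change records.

    `grid[r][c]` is a tuple of variable values; variable i is stored at
    position i + 3 because positions 1 and 2 hold the row and column numbers.
    """
    records = []
    for r, row in enumerate(grid, start=1):
        records.append((ROW_POS, r))
        current = [None] * len(row[0])          # assumed: no carry-over between rows
        for c, cell in enumerate(row, start=1):
            changed = [(i + 3, v) for i, v in enumerate(cell) if v != current[i]]
            if changed:
                records.append((COL_POS, c))    # spatial identification record
                records.extend(changed)
                current = list(cell)
    return records


def decode_changes(records, n_cols, n_vars):
    """Rebuild the grid from (position, value) change records."""
    grid, state = [], None

    def fill(upto):
        # Repeat the current values for every column before `upto`.
        while len(grid[-1]) < upto - 1:
            grid[-1].append(tuple(state))

    for pos, value in records:
        if pos == ROW_POS:
            if grid:
                fill(n_cols + 1)                # finish the previous row
            grid.append([])
            state = [None] * n_vars
        elif pos == COL_POS:
            fill(value)
        else:
            state[pos - 3] = value
    if grid:
        fill(n_cols + 1)
    return grid
```

Only the variables that actually change generate records, so a file
mixing slowly and rapidly changing variables still compresses the
slowly changing ones.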

A current example illustrates the significance of data compression
to increase efficiency of a system. Mitch Modeleski of the Association
of Bay Area Governments (ABAG) has evaluated their data files to
determine the space savings from data compression. He found that
significant space could be saved by the methods suggested here. He
also noted that a point of diminishing returns was reached where the
efficiency of further data compression was offset by the system
requirements for accessing and manipulating the stored data. Table I
lists the variables that are stored by ABAG, using the single variable
run-encoded technique described earlier. The matrix size is 450 rows
by 420 columns for a total of 189,000 cells. Each cell covers 500
meters by 500 meters.


                            Table I

Variable            Records    Cells/Run    Classifications

Slope Stability      19,722       9.81             4
Geology              11,333      17.38             6
Precipitation        15,903      12.24            28
Fault Zones           4,023      53.00            56
Coast Line            3,074      72.22             4
Flood Zone            5,179      40.03             2
Well Yield            7,346      27.44             4
Soil Type            15,655      12.60           135


The ABAG data differs from HEC data in several ways. The cell
size is 62 acres compared to the 1.14 acres used by HEC in its
smaller grid cell. It follows that the number of cells per run
would be considerably higher for a smaller grid cell size. For
the same number of classifications the runs may be seven to eight
times longer. Using the smaller factor, the data storage savings
by using compression would range from 35 times less space for slope
stability to 250 times less space for the coast line variable,
because only two fields are needed to define a run. Even if a more
conservative factor is used, it is obvious that the data storage
reduction possible with this method is considerable.
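
The quoted savings can be reproduced approximately under one reading
of the paragraph above, offered here only as an assumption: the
uncompressed file stores one field per cell, the run-encoded file
stores two fields per run, and the Table I run lengths are multiplied
by the smaller factor of seven. The function name is hypothetical.

```python
FIELDS_PER_RUN = 2  # value and rightmost column


def estimated_savings(cells_per_run, run_length_factor=7):
    """Estimated ratio of uncompressed to run-encoded storage.

    Assumes one stored field per cell when uncompressed and two fields per
    run when run-encoded; `run_length_factor` scales the ABAG run length to
    the smaller HEC cell size (the text suggests seven to eight).
    """
    return cells_per_run * run_length_factor / FIELDS_PER_RUN


if __name__ == "__main__":
    print(estimated_savings(9.81))    # slope stability, about 34
    print(estimated_savings(72.22))   # coast line, about 253
```

These values are close to the rounded figures of 35 and 250 given in
the text.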

A study that ESRI is presently working on in northern Venezuela
has files for most variables at two different grid cell sizes. The
data was initially converted from polygon form to grid cells with
66.67 meter sides. Then 3 by 3 groups of cells were aggregated into
larger cells with 200 meter sides. This technique guarantees that
the primary occurrence in a large cell is not replaced by a secondary
feature. It also allows for the secondary feature to be retained
as a separate file at the larger grid cell size. The 66 meter grid
cell file has 767 rows and 766 columns for a total of 587,522 cells,
while the 200 meter grid cell file is 256 by 256 or 65,536 cells.
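
The aggregation step can be illustrated with the sketch below. It
assumes that the "primary occurrence" is the most frequent value in
each 3 by 3 group and the "secondary feature" the next most frequent;
the text does not spell out the rule, so this interpretation, the
function name and the handling of ties (first value encountered wins)
are assumptions.

```python
from collections import Counter


def aggregate(grid, block=3):
    """Aggregate block x block groups of cells into larger cells.

    Returns (primary, secondary) grids holding the most and second most
    frequent value in each group; secondary is None where a group is uniform.
    Partial groups at the right and bottom edges are aggregated as they are.
    """
    primary, secondary = [], []
    for r in range(0, len(grid), block):
        p_row, s_row = [], []
        for c in range(0, len(grid[0]), block):
            values = [v for row in grid[r:r + block] for v in row[c:c + block]]
            ranked = Counter(values).most_common(2)
            p_row.append(ranked[0][0])
            s_row.append(ranked[1][0] if len(ranked) > 1 else None)
        primary.append(p_row)
        secondary.append(s_row)
    return primary, secondary
```

Because partial groups at the edges are kept, 767 rows of small cells
yield 256 rows of large cells, in agreement with the file sizes quoted
above.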

Table  II  compares  five  of  the  variables  in  the  grid  cell  file.  All 
variables  are  areal  type  data  except  infrastructure. 

                                 Table II

                           66 Meter Cells        200 Meter Cells
Variable                  Records  Cells/Run    Records  Cells/Run   Classifications

Political Boundaries        3,410     172.3       1,129      58.0            6
Agricultural Districts      3,162     185.8       1,040      63.0            4
Traffic Zones               2,355     249.5         787      83.3            2
Infrastructure             83,963       7.0      20,398       3.2          127
Visual Resources           10,028      58.6       3,285      20.0           19


For infrastructure a secondary file of 4246 records and a tertiary
file of 549 records were also created at the 200 meter grid
cell size. Elevation data was manually coded at the 200 meter
cell size only. Table III shows the number of records required to
store the data in standard run-encoded form when the elevation
readings were rounded to some multiple of meters.


                          Table III

           Rounding Elevation to Nearest N Meters

          Run-encoded                     Average
   N        Records      Cells/Run    Elevation Change

   1         49569          1.32           12.27
   2         42718          1.53           13.87
   5         32792          2.00           18.98
  10         26415          2.48           24.41
  20         20065          3.26           33.44
  40         14286          4.59           50.49
 100          6119         10.70          117.28

The study area involved has a mixture of flat marsh land in one
portion and very rugged mountainous terrain in another portion. By
rounding to a large constant multiple for the whole file, detail in
the marsh land is lost without affecting the mountainous area. This
is normally the reverse of what is desired in a HEC flood control
study.
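
The effect described above can be seen on a small made-up example. The
elevation values below are hypothetical and the helper names are
illustrative; they only show how rounding to a constant multiple
collapses a flat profile into a single run while leaving a rugged
profile uncompressed.

```python
def round_to_multiple(value, n):
    """Round `value` to the nearest multiple of `n` (halves round up)."""
    return int((value + n / 2) // n) * n


def count_runs(row):
    """Number of run-encoded records needed for one row of values."""
    return sum(1 for i, v in enumerate(row) if i == 0 or v != row[i - 1])


if __name__ == "__main__":
    marsh = [1, 2, 2, 3, 3, 4]                   # hypothetical flat terrain, meters
    mountain = [120, 260, 410, 380, 530, 700]    # hypothetical rugged terrain, meters
    for name, row in (("marsh", marsh), ("mountain", mountain)):
        rounded = [round_to_multiple(v, 10) for v in row]
        print(name, count_runs(row), count_runs(rounded))
```

In this example the marsh row drops from four runs to one, losing all
of its relief, while the mountain row still needs six records.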

It appears that elevation does not lend itself to run-encoding,
so that the uncompressed mode technique mentioned earlier may be
especially suitable for this variable. Other continuous data may,
however, be rounded and still retain its usefulness. One such
variable is topographic slope. Table IV shows the data compression
possible by rounding. The slope file used was created by taking the
greatest change in value for each cell in the elevation file.


                          Table IV

            Slope Rounded to Nearest N Percent

             Run-encoded                   Average
     N         Records      Cells/Run    Difference

     1          35286          1.86          2.8
     2          26420          2.48          5.0
     4          18023          3.64          7.0
     8          11377          5.74         10.2
  Special*      17899          3.66           -

*Slope grouped 0-1, 2-4, 5-9, 10-14, 15-24, 25-39, 40-59, 60+

The special case of uneven grouping required slightly less
storage than the rounding to 4 percent even though the information
retained should be much more usable. The precipitation variable in
the ABAG study also used a similar grouping technique.
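
An uneven grouping of this kind amounts to a lookup against a list of
class boundaries. The sketch below assumes the eight slope groups of
the Table IV footnote read as 0-1, 2-4, 5-9, 10-14, 15-24, 25-39, 40-59
and 60 or more percent; two of these ranges are partly illegible in the
source and are inferred from their neighbors. The names used are
hypothetical.

```python
from bisect import bisect_right

# Lower bounds of the assumed groups: 0-1, 2-4, 5-9, 10-14, 15-24, 25-39, 40-59, 60+.
GROUP_LOWER_BOUNDS = [0, 2, 5, 10, 15, 25, 40, 60]


def slope_group(slope_percent):
    """Return the 1-based group number for a non-negative whole-percent slope."""
    return bisect_right(GROUP_LOWER_BOUNDS, slope_percent)
```

Narrow groups at low slopes keep detail where small differences matter,
while wide groups at high slopes lengthen the runs in steep terrain.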

Another item worth comment is that several variables in the ABAG
and Venezuela files have been represented in an areal form rather
than in a continuous form. These include slope stability,
precipitation and slope categories. For many analyses this approach is
acceptable, and the data storage savings is very great. The HEC
DAMCAL program requires the variables to be in continuous form, but
this does not preclude using the same variables in a pseudo-areal
form for the RIA program.

In order to maintain a data file compatible with the present
software, and at the same time achieve data compression which
approximates a variable grid size, we recommend the following types of
data storage:




1. In single-variable form

      [ c | v ]

   where v = Value.
         c = Rightmost column of row to use value v.

   Leftmost column to apply v is one greater
   than column specified in previous record.

   Special records using negative c values for
   new rows and other controls.

   For switching to uncompressed data mode a special negative
   value could be used. Returning to run-encoded mode would be
   automatic after column c value is read.

2. In multivariable form

      [ p | v ]

   where p = Position in multivariable.
         v = Value for this position.

   Value applies until next record with same position.
   New row when p = 1.
   New column when p = 2.
   No column record if no changes in other positions.

In addition, serious consideration should be given to analysis
of windowed areas. Rather than working through the total data file
when a particular area is being studied, a single program pass can
produce a separate data file for just the study area. This is a very
efficient technique when data is stored for a large area. The total
data bank can then be stored off-line and only the areas of concern
used on-line.
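
Windowing can be done directly on run-encoded rows without expanding
them. The sketch below assumes each row is a list of (value, rightmost
column) records with 1-based columns; the function names and the
renumbering of the window so that it starts at column 1 are
illustrative choices, not part of any existing program.

```python
def window_row(records, first_col, last_col):
    """Clip one run-encoded row to columns first_col..last_col (1-based, inclusive).

    `records` are (value, rightmost_column) pairs; output columns are
    renumbered so that the window starts at column 1.
    """
    clipped, left = [], 1
    for value, right in records:
        lo, hi = max(left, first_col), min(right, last_col)
        if lo <= hi:                                  # run overlaps the window
            clipped.append((value, hi - first_col + 1))
        left = right + 1
    return clipped


def window_grid(rows, first_row, last_row, first_col, last_col):
    """Extract a rectangular window from a list of run-encoded rows."""
    return [window_row(r, first_col, last_col) for r in rows[first_row - 1:last_row]]
```

A single pass of this kind over the full data bank produces the smaller
study-area file, which is then the only file that needs to be on-line.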




III. Data Flow to Programs

Due to the modular approach proposed here, stored data can be made
compatible with the existing programs by replacing the input/output
(I/O) library. The flow of data, then, would proceed in the same manner
as it now does. To increase efficiency, the RIA program should be changed
to accept data in the form of several single variable files, in addition
to a multivariable form.

The data flow routine can be structured so that blocking and deblocking
occur efficiently. That is, blocks or groups of variables are
transferred from storage to the program, rather than each variable being
located and transferred separately. The blocks of data are separated
(deblocked) for analysis so that the program receives the data in the
same manner as it does at the present time.


IV.  Graphics 

The data would continue to exist in a uniform cell size structure
and would, therefore, be completely compatible with existing display
techniques.


V. Advantages

There are three significant advantages to using a system of
run-encoded data.

1. Less storage space is required due to the compression achieved
by this method. Additional storage space can be saved by using
single-variable files whenever possible.

2. The run-encoded analysis can be structured such that analysis
occurs only when a variable changes value between two cells.
Efficiency of analysis is increased by this method, because the
first cell with a new value is analyzed and subsequent homogeneous
cells need not be analyzed separately. A variable which
is homogeneous throughout the study area need not be analyzed,
while a continuous variable would be analyzed for each grid cell.

3. Because only those grid cells with new values are read into the
program, fewer records are processed than at the present time.
This increases the overall speed of processing.
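
The second and third advantages can be illustrated with a simple
tabulation that touches each run once instead of each cell. The
example assumes rows stored as (value, rightmost column) records with
1-based columns; the function name and the choice of an area summary as
the "analysis" are hypothetical.

```python
def area_by_value(records, cell_area):
    """Total area per value for one run-encoded row, processing each run once."""
    totals, left = {}, 1
    for value, right in records:
        run_length = right - left + 1
        totals[value] = totals.get(value, 0) + run_length * cell_area
        left = right + 1
    return totals


if __name__ == "__main__":
    # Row 3,3,3,7,7,2 with 200 ft by 250 ft cells (50,000 square feet each).
    print(area_by_value([(3, 3), (7, 5), (2, 6)], 50000))
```

Three records are processed here in place of six cells; the saving
grows with the average run length.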

Two disadvantages may result from adoption of this system:

1. Modification of software is required. This disadvantage is minor
compared to alternative systems which require entirely new
software packages.

2. A slight increase in processing may result from the routines
which convert files to internal form. Again, increasing storage
efficiency must be weighed against decreased efficiency of
handling compressed data.




VI. Conclusion

The procedures recommended here allow HEC to maintain its present
software with minor modifications. Significant increases in efficiency
can be obtained by variable sampling, coding and storage techniques
without revising the entire system to use a variable grid cell size.

If variable grid size is desirable, different files for different
areas should be considered, using the present system.

Just  as  Detroit  has  found  that  for  the  short  term  it  is  easier  to 
rework  the  internal  combustion  engine  than  to  produce  a completely 
new  scheme,  so  the  problems  associated  with  a uniform  grid  size  can  be 
minimized. 


VII.  References 


Modeleski, Mitch, "The Basis Papers", Berkeley, ABAG, October, 1976.

Dueker,  Kenneth  J.,  "Geographic  Data  Encoding  Issues",  Technical 
Report  #43,  Institute  of  Urban  and  Regional  Research,  University 
of  Iowa,  June,  1975  (revised). 




APPENDIX  A 

DESCRIPTION OF ESRI SOFTWARE FOR X,Y GEOGRAPHIC ANALYSIS,
CONVERSION  TO  GRID  DATA  AND  GRID  GEOGRAPHIC  ANALYSIS 


This section describes the general software systems developed by ESRI.
The initial material in section 1 presents an overview of geocoding techniques
and software systems. This overview is followed by a presentation of the
PIOS, GRIPS, and GRID programs.

1.  Software  System  and  Geocoding  Techniques  Overview 

As stated in the report, ESRI has developed software which utilizes
geocoded x,y coordinate data. This method of geocoding involves detailed
coordinate recognition of map features which are spatially identified
as polygons, lines and points. This approach requires that geographic
data be digitized and stored according to x,y point measurements. In
the case of polygons and lines, this requires a series of points defining
the perimeter of each polygon or variation of each line segment. Single
points are referenced by a single measurement.

The  x,y  coordinate  technique  maintains  the  integrity  of  the  original 
data  by  storing  the  precise  points  which  represent  variations  of  the 
geographic  data.  Once  the  data  is  digitized  according  to  x,y  coordinates, 
the  data  file  can  be  converted  by  GRIPS  software  to  a "grid  cell"  format 
of  any  scale  or  size  necessary  for  analysis.  This  capability  allows  the 
PIOS automation technique to interface with the GRID system and utilize all
the  grid  cell  analysis  and  output  advantages. 

While  computer  transformation  to  grid  cell  matrices  is  normally  used, 
manual  transformation,  called  encoding,  is  also  possible.  Encoding  to 
the  GRID  system  is  the  same  as  digitizing  to  the  PIOS  system.  Encoding 
may  occur  cell  by  cell  or  by  groups  of  cells.  Cell  by  cell  encoding 
usually  occurs  when  data  being  automated  exhibits  a great  amount  of  variation. 
For  variables  where  data  is  largely  homogeneous,  consolidated  encoding  may 
occur.  The  consolidation  involves  identifying  for  each  row,  beginning 
and  ending  columns.  The  cells  interior  to,  and  inclusive  of  the  identified 
columns  are  inherently  identified  as  having  the  same  code.  In  either  case, 
a numeric symbol is chosen to identify the sub-variable, such as the
topographic elevation, type of vegetation, type of soil, etc., most
representative of each cell. The group of cells, each thus identified, is then
collectively represented by the total matrix. Once encoded in numerical terms, the
variable  can  be  stored  within  the  computer  as  a digital  data  matrix.  Usually, 
several  data  files  are  similarly  created  for  a specific  area,  each  file 
representing  a different  geographic  variable.  Once  created,  they  can  be 
stored  separately  as  single  variable  data  files,  or  merged  into  one  or 
more  multi-variable  data  files.  The  encoding  procedure  is  similar  for  each 
variable,  although  special  considerations  may  sometimes  be  necessary  due  to 
map  scales,  quality,  and  type  of  available  information,  etc. 
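
The consolidated encoding described above behaves like a run-based encoding of each row. The following is a minimal sketch in Python, assuming a hypothetical record layout of (row, beginning column, ending column, code) with 1-based, inclusive columns. The layout, the function name `expand_runs` and the sample codes are illustrative assumptions and are not taken from the GRID documentation.

```python
def expand_runs(runs, n_rows, n_cols, fill=0):
    """Expand consolidated (row, first_col, last_col, code) records into a matrix.

    Rows and columns are 1-based and the column range is inclusive,
    mirroring "beginning and ending columns" in the text. The record
    layout itself is an assumption; the source does not specify it.
    """
    grid = [[fill] * n_cols for _ in range(n_rows)]
    for row, first_col, last_col, code in runs:
        if not (1 <= row <= n_rows and 1 <= first_col <= last_col <= n_cols):
            raise ValueError(f"run outside grid: {(row, first_col, last_col)}")
        # Every cell between the beginning and ending column gets the same code.
        for col in range(first_col, last_col + 1):
            grid[row - 1][col - 1] = code
    return grid


# Hypothetical vegetation codes for a 3 x 5 grid.
runs = [(1, 1, 5, 7), (2, 1, 2, 7), (2, 3, 5, 4), (3, 1, 5, 4)]
grid = expand_runs(runs, n_rows=3, n_cols=5)
```

Four records stand in for fifteen cells here, which is the saving the text attributes to consolidated encoding of largely homogeneous variables.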

Table 1 outlines the procedures and outputs of the PIOS/GRIPS/GRID
scheme.




Table 1

Procedures/Outputs of the PIOS/GRIPS/GRID Scheme


Program or Procedure         Description

Input Procedure

X,Y Coordinate Digitizing    A procedure used for recording x,y
                             coordinate measurements for computer
                             referencing of points, lines and
                             polygons.

DIGITIZER CONVERSION         Converts magnetic tape computer
                             card images of digitized data to a
                             disc or a regular tape file in the
                             appropriate coordinate system.

POLYGON MERGE                Allows portions of a file which
                             had been redigitized due to updates
                             since initial automation, errors
                             found in editing, etc., to be merged
                             into the correct portion of the
                             parent file.

MATCH VERTICE                Consists of five programs which allow
                             vertices of adjacent polygon borders,
                             which were separately digitized and
                             may not coincide, to be matched and
                             represented by points with identical
                             vertices.

EDIT VERTICE                 Allows strings of matchverted
                             vertices which comprise adjacent
                             polygon common borders to be analyzed
                             and compatibility resolved.

DONUT                        Provides area calculation adjustments
                             due to polygons completely contained
                             within other polygons.

DSCRIPTOR                    Calculates the area, centroid and
                             minimum and maximum coordinates for
                             each polygon and transforms numeric
                             digitized data codes to alpha codes.

BILINEAR                     Allows a digitized map to be
                             transformed into another coordinate
                             referencing system.



Program or Procedure         Description

Data Manipulation

OVERLAY                      Calculates intersection coordinates
                             of polygons overlayed on other
                             polygons, thus identifying newly created
                             polygons.

STATISTICS                   Various programs for calculating
                             area statistics for polygons.

POLYMODEL                    A program that allows the user to
                             perform flexible overlays and
                             rescaling/weighting procedures to
                             model polygon oriented data.

Utility Programs

DIGITIZER FILE GENERATION    A transition program to enhance
                             the transformation of digitized
                             data for input to the DIGCNV Program.

LIST VERTICE                 A program used to create a list of
                             vertices for editing digitized
                             data or a file of selected data
                             for partial plotting.

CHANGE VERTICE               Allows the user to manipulate point
                             data to correct errors made during
                             the digitizing process.

DELETE AND SHIFT VERTICES    A program to shift vertices and to
                             eliminate one of two identical
                             vertices within the same polygon
                             resulting from the MATCHVRT Program.

SINGLE OVERLAY               A program to create an overlay file
                             when the subordinate (minor) variable
                             area to be summarized is contained
                             entirely within a single polygon of
                             the dominant (major) variable.

IDENTIFIER FILE GENERATION   A program to create a concise file
                             of PIOS automated data which contains
                             identification records of each polygon
                             minus the vertice coordinate data
                             describing the polygon perimeter.

POINT REVERSE                A program to reverse the order of
                             a string of digitized points.


Group labels printed alongside this part of the table: File Creation
and Manipulation; File Creation; Display Output Programs; Utility Programs.


Program or Procedure         Description

CODE FIND                    Allows the user to select specific
                             codes for generating selected
                             characteristic plots.

UTM CONVERSION               This program calculates UTM
                             coordinates from latitude and
                             longitude coordinates.

TERRAIN UNIT EXPANSION       Enables a user to expand the normal
                             PIOS 16 digit ID record up to
                             40 digits.

SPLIT IDENTIFIER             Allows a user to split the Terrain
                             Unit Expansion File into unique
                             variable components.

ADD VARIABLE                 Allows variables to be added to a
                             file after the Terrain Unit
                             Expansion and Split Identifier
                             Programs have been run.

POLYPLOT                     Creates computer drawn plots of
                             digitized information.

DROPLINE PLOT                Allows the elimination of common
                             boundaries between adjacent
                             terrain unit polygons when similar
                             components have the same code.

GRIPS                        A program used to convert x,y
                             coordinate data to a grid cell
                             format (gridded information from
                             polygons).

FILE GENERATION              Program used to generate an
                             unformatted, sequential single variable
                             data file from a cell by cell data
                             encoding technique.

RECORD GENERATION            A program similar to FILGEN, except
                             a consolidated data encoding
                             technique is used and windowing and
                             updating are possible.

GRID


Program or Procedure         Description

GRID MERGE                   Allows the user to merge one or more
                             single variable files into a
                             multi-variable file.

GRID MODEL                   A program that allows the user to
                             perform flexible overlays and
                             rescaling/weighting procedures to
                             model gridded geographic data.

SEARCH                       Allows the user to accomplish
                             frequency/distance analyses.

TOPOGRAPHIC ANALYSES         A group of programs used to perform
                             various calculations for a) slope
                             b) aspect c) sun intensity d) grading
                             e) visual exposure.

SPECIAL ANALYSES             Programs to perform various
                             calculation and analysis problems such as
                             a) time/distance search b) gravity
                             models c) diffusion models d) area
                             calculation e) frequency calculation
                             f) aggregation analysis g) statistical
                             analysis, including standard
                             deviation, mean et al. h) sort/
                             cross tabulations.

CELL UPDATE                  Program used to update cell or
                             area data.

WINDOW FILE                  A program used for identifying
                             a portion of a data file for selected
                             area analyses.

RECORD GENERATION            Accomplishes both of the above.

CELL LIST                    Allows the user to list the inventory
                             of a given cell.

WINDOW LIST                  This program summarizes the inventory
                             of specified polygon or variable area.

LINE LIST                    A program which summarizes the
                             inventory of a specified linear zone of
                             data.



Display/Output Programs


Program or Procedure         Description

GRIDPRINT                    A program which allows the user
                             to generate "line printer" maps.

GRIDPLOT                     A program which generates "pen
                             plotter" maps.

ELECTROSTATIC GRIDPLOT       A program which generates
                             electrostatic plotter maps.

GRIDVIEWS                    A program to generate a
                             non-continuous display of 3-D grid
                             data.

VIEWS                        A program to display a continuous
                             3-D surface.



[Figure: flow diagram of the software scheme. Legible box labels: Mapped
Geographic Information; Utility Programs; POLYPLOT (final plot); OVERLAY
(calculate intersection polygons); STATISTICS program; GRIPS (polygon to
cell calculation); GRID.]


2. PIOS Software

The primary purpose of PIOS is to create an automated, geographic
oriented, multi-variable data file of polygon, point and line data which
can be used individually or in overlay combinations for various analyses
and graphic outputs.

Various component programs comprise PIOS and, following the digitizing
of the data, can provide the user the above capabilities.

These component programs are summarized in the following section.

2.1 Digitizer Conversion - DIGCNV

The Digitizer Output Conversion Program (DIGCNV) converts digitized
data from magnetic tape to a disc or tape file in various formats.

The program has the capability to edit the identifiers associated with
each digitized polygon or line. A list of the identifiers and the
coordinates associated with each polygon may be produced by this program.

Conversion to another referencing system is possible. This is
provided for by digitizing at least three tic marks with coordinates
relative to the coordinate system into which the data is to be
transformed (i.e., State Plane, Universal Transverse Mercator, etc.). This
is accomplished just prior to digitizing the map. The DIGCNV program
calculates the appropriate factors for this transformation for various
pairs of digitized tics and selects the pair to be used, based on
several user supplied options.
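
A pair of tic marks is enough to determine a scale, a rotation and a translation. The sketch below assumes that form of transformation, which the text does not state, and expresses it with complex arithmetic. The function names and tic values are hypothetical, and the user supplied options that govern which pair DIGCNV selects are not modeled.

```python
def fit_from_tic_pair(dig1, dig2, ref1, ref2):
    """Fit w = a*z + b (scale, rotation, translation) from two tic marks.

    dig1, dig2: digitizer coordinates of the two tics.
    ref1, ref2: the same tics in the target referencing system.
    Assumed transformation form; the source does not give DIGCNV's equations.
    """
    z1, z2 = complex(*dig1), complex(*dig2)
    w1, w2 = complex(*ref1), complex(*ref2)
    if z1 == z2:
        raise ValueError("the two digitized tics must be distinct")
    a = (w2 - w1) / (z2 - z1)   # combined scale and rotation factor
    b = w1 - a * z1             # translation
    return a, b


def apply_fit(fit, point):
    """Transform one digitizer point into the target system."""
    a, b = fit
    w = a * complex(*point) + b
    return (w.real, w.imag)


# Invented values: 10 digitizer units span 1000 ground units, no rotation.
fit = fit_from_tic_pair((0.0, 0.0), (10.0, 0.0), (5000.0, 3000.0), (6000.0, 3000.0))
third_tic = apply_fit(fit, (0.0, 10.0))
```

Transforming a third tic and comparing it with its known coordinates gives a residual, which is one plausible basis for preferring one pair over another.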

2.2 MERGE FILE (MERGFILE)

Sometimes a portion of the digitized data is not satisfactory and
must be redigitized. The merge file (MERGFILE) program allows up to
two of these digitized files to be merged. Various features are
available to edit, delete, or insert the polygons from the redigitized
data file into the original data file.

2.3 MATCH VERTICE (MATCHVRT)

The purpose of the match vertice portion of PIOS is to eliminate
sliver underlaps and overlaps of double digitized data points (vertices).
Slivers sometimes occur because each vertice is digitized twice, once
for each polygon encompassing the common border. Due to such dual
automation, the digitized points may not exactly coincide. The match
vertice portion of PIOS utilizes five programs in consonance with a
user specified tolerance distance between matching vertices to resolve
this. It should be noted that programs 2 and 4 are standard system
sort routines and are not controlled by the user as are 1, 3, and 5.
Sometimes consecutive points on a line segment fall within this specified
tolerance and are matchverted. This creates two points with the same
coordinates. For subsequent processing efficiency (such as for the
EDITVERT program), the matchvert programs also eliminate duplicate
points.
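
The tolerance matching can be sketched as a simple snap of one digitized border onto the other. This is an illustration only: the five-program, sort-based procedure of MATCHVRT is not reproduced, and the function name and sample coordinates are hypothetical.

```python
import math


def match_vertices(reference, candidate, tolerance):
    """Snap each candidate vertex to the nearest reference vertex within tolerance.

    reference: vertices of the border as digitized for one polygon.
    candidate: the same border as digitized for the adjacent polygon.
    Consecutive duplicates produced by snapping are dropped, as the text
    says the matchvert programs do. A brute-force nearest-vertex search is
    assumed here in place of the sort routines used by PIOS.
    """
    matched = []
    for point in candidate:
        nearest = min(reference, key=lambda ref: math.dist(ref, point))
        if math.dist(nearest, point) <= tolerance:
            point = nearest
        if not matched or matched[-1] != point:
            matched.append(point)
    return matched


reference = [(0.0, 0.0), (5.0, 0.0), (10.0, 0.0)]
candidate = [(0.02, 0.01), (4.98, -0.01), (5.01, 0.02), (10.0, 0.03), (10.0, 5.0)]
snapped = match_vertices(reference, candidate, tolerance=0.05)
```

In this example the second and third candidate points both snap to (5.0, 0.0), so only one copy is kept, and the last point is left alone because it lies well outside the tolerance.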




2.4  EDIT  VERTICE  (EDITVERT) 

The  Edit  Vertice  (EDITVERT)  portion  of  PIOS  is  composed  of  5 
computer  programs.  These  programs  are  used  to  edit  interfacing  line 
segments  (i.e.,  the  double  digitized  series  of  vertices)  which  repre- 
sent the  overlapping  boundaries  of  adjacent  polygons.  Selected 
polygons  or  an  entire  file  may  be  edited  with  EDITVERT.  EDITVERT 
changes  vertices  as  needed  to  correct  the  original  PIOS  file.  Two 
types  of  changes  are  made:  1)  a point  is  added;  2)  two  points  are 
averaged.  A point  is  added  when  a series  of  double  digitized  vertices 
representing  a line  segment  between  adjacent  polygons  contains  an 
unpaired  vertice.  The  unpaired  vertice  is  added  in  the  proper 
position  in  the  otherwise  identical  line  segment.  This  may  be 
accomplished  only  within  a string  of  matched  line  segment  vertices, 
not  at  either  end. 

Averaging  two  vertices  occurs  when  two  unpaired  vertices,  one 
from  each  of  the  two  adjacent  polygons,  exist  within  matching  line 
segments,  in  the  same  position  in  the  otherwise  paired  strings.  These 
similarly positioned unpaired vertices may be averaged when they are
within a preset tolerance of each other.
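
The averaging case can be sketched as follows, assuming the two double digitized strings have already been aligned so that they hold the same number of vertices in corresponding positions. The point-insertion case is not covered, and the function name is hypothetical.

```python
import math


def average_unpaired(string_a, string_b, tolerance):
    """Average vertices that differ between two otherwise paired strings.

    string_a, string_b: the common border as digitized for each of the two
    adjacent polygons, aligned position by position (an assumption).
    Differing vertices are replaced in both strings by their mean when
    they lie within the tolerance of each other.
    """
    if len(string_a) != len(string_b):
        raise ValueError("strings must have the same number of vertices")
    out_a, out_b = list(string_a), list(string_b)
    for i, (pa, pb) in enumerate(zip(string_a, string_b)):
        if pa != pb and math.dist(pa, pb) <= tolerance:
            mean = ((pa[0] + pb[0]) / 2.0, (pa[1] + pb[1]) / 2.0)
            out_a[i] = out_b[i] = mean
    return out_a, out_b


border_a = [(0.0, 0.0), (1.0, 1.0), (2.0, 2.0)]
border_b = [(0.0, 0.0), (1.5, 1.0), (2.0, 2.0)]
fixed_a, fixed_b = average_unpaired(border_a, border_b, tolerance=1.0)
```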

2.5  DONUT 

As  mentioned  in  the  digitizing  portion  of  this  documentation,  the 
DONUT program is used to manipulate the coordinates for specially
digitized  polygons  which  completely  surround  other  polygons.  These 
are  called  donut  polygons  because  holes  are  left  in  the  surrounding 
polygons  when  the  areas  of  the  surrounded  polygons  are  deleted. 

A maximum of five levels of donuts may be used with this program.
Level one is the outside polygon and the level increases by one until
the innermost polygon is reached. Polygons which do not contain other
polygons and are not contained themselves are given a level "0"
classification. The order in which the polygons are digitized is
important. Figure 46 shows the correct level of the various polygons
and the order in which the polygons must be digitized.




The program automatically creates new sets of polygon coordinates
with  the  donut  coordinates  inserted  in  counter  clockwise  order  after 
the  first  point  of  each  appropriate  surrounding  polygon.  The  branch 
is  then  back  to  the  first  point  to  create  a hidden  line.  This  is 
continued  for  each  donut  area  before  continuing  clockwise  around  the 
vertices of the surrounding polygon. The procedures mentioned assure
that the proper status of each polygon is considered when statistical
runs, such as area calculations, are made.
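
The effect of the inserted donut coordinates on an area calculation can be illustrated with a short sketch. It assumes a single donut, an outer ring stored clockwise and a donut ring stored counter clockwise, as described above. The function names are hypothetical and the coordinates are invented.

```python
def signed_area_clockwise(ring):
    """Trapezoid-rule area that is positive for clockwise rings."""
    total = 0.0
    for (x1, y1), (x2, y2) in zip(ring, ring[1:] + ring[:1]):
        total += (x2 - x1) * (y2 + y1)
    return total / 2.0


def insert_donut(outer, donut):
    """Insert a counter clockwise donut after the first point of a clockwise ring.

    The path runs first point -> donut vertices -> donut start -> first
    point again, and then continues clockwise around the outer ring. The
    out-and-back link is the "hidden line"; its two traversals cancel in
    the area sum.
    """
    first = outer[0]
    return [first] + donut + [donut[0], first] + outer[1:]


outer = [(0.0, 0.0), (0.0, 10.0), (10.0, 10.0), (10.0, 0.0)]   # clockwise
donut = [(4.0, 4.0), (6.0, 4.0), (6.0, 6.0), (4.0, 6.0)]       # counter clockwise
combined = insert_donut(outer, donut)
net_area = signed_area_clockwise(combined)
```

Here the outer square has an area of 100 and the donut an area of 4, so the combined coordinate set yields 96 without any special handling in the area routine.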

2.6  DESCRIPTOR  (DSCRPTOR) 

The coordinate descriptor calculation program (DSCRPTOR) is used
to calculate the area, centroid, and minimum and maximum coordinates
for each polygon. Prior to final plotting, DSCRPTOR is used as an edit
step. The area calculation portion of DSCRPTOR provides this edit
(i.e., if an area calculation for a polygon yields a "0", the polygon
is not defined by a "closed" path, that is, digitizing did not return to
the origin; if the area calculation is negative, the polygon was
digitized backwards when it should not have been). For final plotting,
the 16 digits of the identification records are split into the
appropriate identifiers and converted to alpha characters, as in the case
of the land use polygons. These polygon identifiers are then output
to form a new file structure acceptable to the overlay, plotting, and
statistical printout programs.
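
A minimal sketch of the descriptor calculation follows, using the standard polygon area and centroid formulas with the sign chosen so that clockwise rings give a positive area. The function name and the status strings are hypothetical. The sketch closes each ring implicitly, so it flags a zero area but cannot reproduce exactly how DSCRPTOR detects a path that did not return to its origin.

```python
def describe_polygon(vertices):
    """Return area, centroid, bounding box and an edit status for one polygon.

    vertices: list of (x, y) tuples without a repeated closing point.
    Area is positive for clockwise order, matching the convention the
    text attributes to DSCRPTOR.
    """
    cross_sum = cx_sum = cy_sum = 0.0
    for (x1, y1), (x2, y2) in zip(vertices, vertices[1:] + vertices[:1]):
        cross = x1 * y2 - x2 * y1
        cross_sum += cross
        cx_sum += (x1 + x2) * cross
        cy_sum += (y1 + y2) * cross
    area = -cross_sum / 2.0
    xs = [x for x, _ in vertices]
    ys = [y for _, y in vertices]
    if area == 0.0:
        status, centroid = "zero area", None
    else:
        status = "ok" if area > 0.0 else "digitized backwards"
        centroid = (cx_sum / (3.0 * cross_sum), cy_sum / (3.0 * cross_sum))
    return {
        "area": area,
        "centroid": centroid,
        "bbox": (min(xs), min(ys), max(xs), max(ys)),
        "status": status,
    }


square = [(0.0, 0.0), (0.0, 10.0), (10.0, 10.0), (10.0, 0.0)]  # clockwise
forward = describe_polygon(square)
backward = describe_polygon(square[::-1])
```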

2.7  POLYGON  PLOTTING  (POLYPLOT) 


High  quality  graphic  representations  are  a basic  output  of  the  PIOS 
system.  The  POLYPLOT  program  is  used  to  create,  edit,  and  produce 
final  computer  drawn  plots  of  the  polygon  outlines  at  various  scales 
with  optional  title  data  for  map  identification.  Plotted  polygon  labels 
may  be  either  numeric  or  alpha,  such  as  is  used  for  land  use.  The 
reference tic marks are also plotted to show the relation to another
coordinate  system,  such  as  state  plane. 

2.8  DROPLINE  PLOTTING 

DROPLINE is a plotting program with card input data similar to the
POLYPLOT program. Sometimes, polygon coding (and subsequent plotting)
involves the same assigned code for contiguous polygons. This often
occurs when dealing with terrain units where the second ID record for
a polygon encompasses several codes. When plotting one variable
represented by a specific code location within the terrain unit expansion
ID, two contiguous polygons for that variable may have the same code.
Thus, one homogeneous area is represented by two polygons. Sometimes
this is visually misleading and may not be desirable. DROPLINE creates
plots where the common boundary between the two identically coded
polygons has been eliminated.

The DROPLINE process actually begins by running the EDITVERT
sequence of programs on the desired data file. The output file from
the second sorting step (EDITVERT program 4) is then stored (using JCL).
This EDITVERT processing step is normally in addition to a prior EDITVERT
submission which was used to "clean" the file. This second EDITVERT
run creates strings of vertices (points), identified by association
with the polygon on the left and the polygon on the right. The program
checks the codes of the polygons on either side of point strings interior
to the study area and plots only the strings where these codes are
different. For vertice strings which correspond to the map borders,
only one of these polygons exists. These border strings are always
plotted.
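
The selection rule for strings can be sketched in a few lines. The record layout below (a dictionary with "left", "right" and "points" keys, with None marking the missing side of a map border string) is a hypothetical stand-in for the EDITVERT output file, whose actual format is not given here.

```python
def strings_to_plot(strings, code_of):
    """Keep only the vertex strings that DROPLINE would plot.

    strings: list of dicts with "left" and "right" polygon ids and "points".
    code_of: mapping from polygon id to the code of the variable plotted.
    Border strings (one side missing) are always kept. Interior strings
    are kept only where the codes on the two sides differ.
    """
    kept = []
    for string in strings:
        left, right = string["left"], string["right"]
        if left is None or right is None:
            kept.append(string)
        elif code_of[left] != code_of[right]:
            kept.append(string)
    return kept


code_of = {1: "OAK", 2: "OAK", 3: "PINE"}   # invented vegetation codes
strings = [
    {"left": 1, "right": 2, "points": [(0, 0), (0, 5)]},     # same code, dropped
    {"left": 2, "right": 3, "points": [(0, 5), (5, 5)]},     # codes differ, kept
    {"left": 1, "right": None, "points": [(0, 0), (5, 0)]},  # map border, kept
]
plotted = strings_to_plot(strings, code_of)
```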

2.9  BILINEAR 

The  digitizing  process  generates  an  automated  computer  readable 
copy  of  a map  which  is  the  same  size  as  the  base  map  original.  The 
BILINEAR  computer  program  is  used  to  transform  this  map  into  another 
coordinate  system  such  as  UTM  or  state  plane  coordinates. 

This  bilinear  transformation  is  possible  because  digitizer 
coordinates  are  measured  for  four  locations  called  "TIC"  marks  for 
which  coordinates  in  the  new  system  are  known.  Since  measurements  in 
both  systems  are  known  for  these  points,  all  digitizer  measurements 
can  be  transformed  to  the  new  referencing  system.  The  new  coordinate 
system  eliminates  distortion  which  existed  on  the  original  map,  allows 
accurate  area  calculations,  and,  because  of  referencing  separately 
digitized maps with a common system, provides a method of joining
adjacent  maps. 
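
Four tic marks are exactly enough to determine a transformation of the form X = a0 + a1 x + a2 y + a3 xy, with a matching equation for Y. The sketch below assumes that form, which is the usual meaning of a bilinear transformation but is not spelled out in the text. The function names and tic coordinates are hypothetical.

```python
def solve_linear(matrix, rhs):
    """Solve a small linear system by Gauss-Jordan elimination with pivoting."""
    n = len(rhs)
    aug = [row[:] + [value] for row, value in zip(matrix, rhs)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("tic marks do not determine a bilinear transform")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for row in range(n):
            if row != col:
                factor = aug[row][col] / aug[col][col]
                aug[row] = [a - factor * b for a, b in zip(aug[row], aug[col])]
    return [aug[i][n] / aug[i][i] for i in range(n)]


def fit_bilinear(digitizer_tics, reference_tics):
    """Fit X and Y coefficient sets from four tics known in both systems."""
    design = [[1.0, x, y, x * y] for x, y in digitizer_tics]
    coeff_x = solve_linear(design, [X for X, _ in reference_tics])
    coeff_y = solve_linear(design, [Y for _, Y in reference_tics])
    return coeff_x, coeff_y


def apply_bilinear(fit, point):
    """Transform one digitizer measurement into the new referencing system."""
    x, y = point
    terms = (1.0, x, y, x * y)
    return tuple(sum(c * t for c, t in zip(coeffs, terms)) for coeffs in fit)


# Invented tics: a 10 x 10 digitizer square mapped to a 1000 x 1000 ground square.
digitizer_tics = [(0.0, 0.0), (10.0, 0.0), (10.0, 10.0), (0.0, 10.0)]
reference_tics = [(500000.0, 4000000.0), (501000.0, 4000000.0),
                  (501000.0, 4001000.0), (500000.0, 4001000.0)]
fit = fit_bilinear(digitizer_tics, reference_tics)
center = apply_bilinear(fit, (5.0, 5.0))
```

Because the xy term lets the four tics map to any quadrilateral, a map sheet that has stretched unevenly can still be fitted, which is consistent with the distortion removal mentioned above.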

2.10 OVERLAY

The OVERLAY program calculates the coordinates of the intersection
polygons resulting from mathematically laying the polygons from a
dominant (major) map on top of the polygons of a subordinate (minor)
map.

Point-in-polygon and line intersection techniques are used to
construct the coordinates of the intersection polygons. The dominant
polygon identifier is added to that of each subordinate polygon or
portion thereof to form the identifier for each intersection polygon.

The output file of the OVERLAY program is in the same format as the
two input polygon files and may be thus used as input to perform a
subsequent overlay.
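
The two building blocks named above can be sketched with textbook formulations: a ray-casting point-in-polygon test and a parametric segment intersection. These are common versions of the techniques and are not taken from the OVERLAY source. The function names are hypothetical.

```python
def point_in_polygon(point, polygon):
    """Ray-casting test: count crossings of a ray running in the +x direction."""
    x, y = point
    inside = False
    for (x1, y1), (x2, y2) in zip(polygon, polygon[1:] + polygon[:1]):
        if (y1 > y) != (y2 > y):               # edge straddles the ray's height
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def segment_intersection(p1, p2, p3, p4):
    """Return the intersection of segments p1-p2 and p3-p4, or None.

    Parallel and collinear segments are reported as None in this sketch.
    """
    d1x, d1y = p2[0] - p1[0], p2[1] - p1[1]
    d2x, d2y = p4[0] - p3[0], p4[1] - p3[1]
    denom = d1x * d2y - d1y * d2x
    if denom == 0:
        return None
    t = ((p3[0] - p1[0]) * d2y - (p3[1] - p1[1]) * d2x) / denom
    u = ((p3[0] - p1[0]) * d1y - (p3[1] - p1[1]) * d1x) / denom
    if 0.0 <= t <= 1.0 and 0.0 <= u <= 1.0:
        return (p1[0] + t * d1x, p1[1] + t * d1y)
    return None


major = [(0.0, 0.0), (0.0, 10.0), (10.0, 10.0), (10.0, 0.0)]
is_inside = point_in_polygon((5.0, 5.0), major)
crossing = segment_intersection((0.0, 0.0), (10.0, 10.0), (0.0, 10.0), (10.0, 0.0))
```

An overlay built from these pieces would use the intersection points as new vertices and the point-in-polygon test to decide which minor vertices fall inside each major polygon.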

2.11  STATISTICAL  PRINTOUTS 

The STATISTICAL PRINTOUT portion of the PIOS system consists of
five programs, each designed to produce one or two specifically formatted
area calculation printouts. The programs are used in conjunction with
files identified by means of job control cards. Input specifications
apply to files so identified. The first program produces area
calculation listings by individual minor polygon codes. The second two programs
produce area calculation listings by major polygon(s) showing individual
minor polygon codes. The last two programs produce area calculation
summaries based on a consolidated code, such as residential land uses
aggregated for "R-1", "R-2", and "R-3" into one consolidated residential
land use code of "R". Each program is described briefly below based
on the output which each program produces.

1.  Minor  Polygon  Summary  - Produces  basic  (or  selected)  area 
summary  listings  by  individual  minor  polygon. 

2.  Minor  Polygon  Summary  - One  Overlay  - Produces  area  summary 
listings  by  individual  minor  polygon  for  polygons  associated  with 
one  overlay  (e.g.,  land  use  polygons  by  census  tract). 

3.  Minor  Polygon  Summary  - Two  Overlays  - Produces  area  summary 
listings  by  individual  minor  polygon  for  polygons  associated  with 
two overlays (e.g., land use polygons by census tract and General
Plan Area).

4.  Minor  Polygon  Summary  (Consolidated  Code)  - One  Overlay  - 
Produces  area  summary  listings  for  minor  polygons  aggregated  by 
code  type,  for  polygons  associated  with  one  overlay  (e.g.,  con- 
solidated land  use  by  utility  service  district). 

5. Minor Polygon Summary (Consolidated Code) - Two Overlays -
Produces area summary listings for minor polygons aggregated by
code type for polygons associated with two overlays (e.g., consolidated
land use by utility sub-region showing area in national forest).
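
The difference between the individual and consolidated summaries can be sketched as a grouping step. The record layout (major polygon, minor code, area) and the rule that a consolidated code is the part of the code before the first hyphen are assumptions made for this illustration, based on the "R-1", "R-2", "R-3" example. The census tract labels and areas are invented.

```python
from collections import defaultdict


def consolidate(code):
    """Map a minor code such as "R-1" to its consolidated code "R" (assumed rule)."""
    return code.split("-", 1)[0]


def area_summary(records, consolidated=False):
    """Total area by (major polygon, code) for one overlay.

    records: iterable of (major_id, minor_code, area) tuples, one per
    intersection polygon. With consolidated=True the minor codes are
    aggregated by code type before summing.
    """
    totals = defaultdict(float)
    for major_id, minor_code, area in records:
        key_code = consolidate(minor_code) if consolidated else minor_code
        totals[(major_id, key_code)] += area
    return dict(totals)


records = [
    ("tract 1", "R-1", 10.0),
    ("tract 1", "R-2", 5.0),
    ("tract 1", "C-1", 2.5),
    ("tract 2", "R-3", 4.0),
]
by_code = area_summary(records)
by_type = area_summary(records, consolidated=True)
```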

2.12  Utility  Programs 

The PIOS system also contains twelve programs whose purpose is to
enhance data through various updating, editing, and manipulating
capabilities. A brief summary follows:

1. DIGITIZER FILE GENERATION (DIGFILGN) - Transforms digitized
data for input into the DIGCNV program. No data manipulations
are associated with this program.

2. CHANGE VERTICES (CHVERT) - This program allows the user to
modify polygons already on the data file or to add additional
polygons to the data file. The user may modify existing polygons
by changing vertice coordinates, deleting vertice coordinates
or adding coordinates. Also, by giving a string of vertices, the
user may create a new polygon.

3.  LIST  VERTICE  (LTVERT)  - LTVERT  allows  the  user  to  create  a list 
of  x,y  coordinates  for  polygon,  line  or  point  data.  The  selection 
of  polygon,  line  or  point  sequence  numbers  is  at  the  discretion  of 
the  user.  The  list  generated  is  used  for  editing  the  vertices  of 
all  or  of  a selected  number  of  polygons,  lines  or  points.  Data 
selected  for  listing  often  depends  upon  apparent  errors  such  as 
contiguous polygons with sliver errors, extra lines in the automated
data,  etc.  Based  on  an  edit  of  generated  listings,  changes  are 
subsequently  made. 


This program may also be used to create a temporary output
file to plot selected polygons, lines or points.

4. Delete and Shift Vertices (DASHVERT) - The Delete and Shift
Vertices (DASHVERT) program allows the user to eliminate one of two
identical vertices within the same polygon resulting from the Match
Vertice program. This only applies to consecutive vertices
digitized for a single polygon which happen to occur within the
Match Vertice tolerance; it does not eliminate double digitized
vertices for contiguous polygon "shared" borders. This capability of
DASHVERT is seldom used since MATCHVRT does this automatically.

The DASHVERT program also allows the coordinates of the data file
to be shifted to facilitate and assure coincidence with another
variable for use by the overlay program. Since coordinate
recordation is accomplished using a cartesian system, a subtractive shift
is required to change the coordinates and approximate a move of the
data file toward the origin.

5. Single Overlay (SINGLOVL) - This program allows the user to
create an overlay file when the subordinate (minor) variable area
to be summarized is contained entirely within a single polygon
of the dominant (major) variable.

6. Identification File Generation (IDFILGN) - The Identification
File Generation (IDFILGN) program allows the user to create a
concise file of PIOS automated data. This concise file consists of
the identification records of each polygon minus the vertice
coordinate data describing the polygon perimeter. Creation of
this identification record file is for efficiency purposes. It is
used by the statistical summary programs, and data manipulations
(e.g., record sorts) are much faster and easier since the coordinate
value "strings" do not enter into the processing procedures.

7. Point Reverse (PTREVRSE) - The Point Reversal program is used
to reverse the order of the digitized vertices describing the outline
of a polygon. This needs to be done if the vertices for a polygon
were erroneously digitized in a counter clockwise order. Since the
DSCRPTOR program assumes clockwise order for area calculation, a
negative area sometimes flags a polygon with counter-clockwise
points. The point reverse program allows this to be corrected.

The reversing operation is performed only on specified polygons.
Polygons  which  contain  the  estimated  centroid  as  the  Kist  vertice 
are identified so the centroid is not included in the point
reversal. Only a portion of the vertices may be reversed, as in
the case of reversing only the vertices for a donut polygon after
the digitized points were run through the donut polygon program.

8. Code Find - This program reads either a two or a three record
PIOS-structured file. Pre-determined linear or polygon data codes
specified by the user are read and their associated PIOS data are
output to a temporary file. Normally, this newly created temporary
file is input to the POLYPLOT program to create a plot of selected
characteristics.




9. UTM Coordinate Calculations (UTMCONV) - The UTMCONV program
calculates and converts coordinates from latitude and longitude
inputs. Conversely, it calculates latitude and longitude from
UTM inputs. The calculations occur using specific points called
TIC marks, for which either latitude/longitude or UTM coordinates
are known.

10. Terrain Unit Identifier Expansion (TUEXPAND) - Polygons automated
using the PIOS system are most often described by two records:
one is a 16 digit record of identifiers which specifies, among
other things, the map number, the polygon sequence number and the
polygon code; the other is a record of coordinates which define the
identified polygon's outline. For most applications, the four to
six digits (of the total sixteen) allowable for the polygon code
in the record of identifiers are sufficient. This is because a
polygon is normally associated with only one type of variation
(i.e., a type of vegetation or a type of soil or a type of geology,
etc.). However, the advent of terrain unit mapping generated
polygons with multi-variable characteristics (i.e., a type of
vegetation and a type of soil and a type of geology, etc.), each
characteristic possibly requiring a four to six digit code. Thus,
the four to six total digits normally allowable for a polygon code
identifier in the first record are not adequate.

In fact, up to forty digits may be required to sufficiently
identify the multiple characteristics associated with terrain unit
polygons.

For this reason, a second identification record may be
generated to allow additional description capabilities. The second
identification record lists (as a character string) the multiple
characteristics which describe a polygon (or polygons). This unique
sequence of characteristics is in turn assigned a numeric terrain
unit code. Referencing this terrain unit code in the first
identification record (where the four to six digit variable code is
normally placed for a single variable polygon) links the first
identification record to the second identification record.

Whether the first identification record is used by itself, or
in conjunction with this second identification record, the coordinate
record remains the same.

If a second identification record is required, the Terrain Unit
Identifier Expansion Program generates this second record for each
polygon.

11. Split Polygon Identifiers (SPLITIDS) - The SPLIT IDENTIFIER
program is used to restructure terrain unit data files for use by
the POLYMODEL program.

The TERRAIN UNIT EXPANSION program was the first step toward
providing PIOS modeling capabilities. It linked unique sequences
of geographic characteristics (listed in a table of possible
variations such as sequence 1 = soil x1, vegetation y1; sequence 2 =
soil x1, vegetation y2, etc.) to polygons in a terrain unit
polygon data file. Each polygon was thus assigned a set of
geographic characteristics which identified its internal variation.
The assigned characteristics were stored either as an expansion
identifier record (in addition to the normal 16 digit identifier
record and the coordinate record) or as an independent file. The
expansion identifier record was an integral part of each polygon
description. The independent file, when accessed, related to the
original file simply by association on a one to one basis (i.e.,
the first expansion ID relates to the first polygon, etc.). For
efficiency, both TERRAIN UNIT EXPANSION output files used the
character (alphameric), versus the binary (numeric) computer
storage concept. The expansion identifier was thus a continuous
string of characters. A specific variable could thus be accessed
only by identifying the beginning and ending digits which recorded
the associated variation (i.e., digits 7-10 of the 40 digit
expansion ID reflect vegetation type). While the character string
storage concept is efficient, character storage does not directly
adapt to arithmetic processing, and identifier strings are continuous
(i.e., vegetation cannot be addressed as, say, variable 5, or
variable 17; it must be addressed as digits 7-10, or digits 26-29,
etc.).

The SPLIT IDENTIFIER program was developed to transform the
TERRAIN UNIT IDENTIFIER generated character files into numeric files
represented by easily manipulated integer values. In addition, it
transforms the continuous character strings into discrete components
(i.e., vegetation may be variable 1, soils may be variable 4, etc.).
This facilitates the modeling process and specifically generates
a multi-variable file in a format usable by POLYMODEL.
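
The restructuring described above can be illustrated with a minimal
sketch. It is not the SPLIT IDENTIFIER program itself: the field
layout, the 12 digit identifier, and the names FIELD_LAYOUT and
split_identifier are assumptions made for illustration (only the
placement of vegetation in digits 7-10 follows the text).

```python
# Hypothetical layout: (variable name, first digit, last digit),
# counted from 1 and inclusive, as in "digits 7-10".
FIELD_LAYOUT = [
    ("soil", 1, 6),
    ("vegetation", 7, 10),
    ("slope", 11, 12),
]


def split_identifier(expansion_id, layout=FIELD_LAYOUT):
    """Split one continuous character identifier into integer variables."""
    values = []
    for name, first, last in layout:
        digits = expansion_id[first - 1:last]
        # Reject short or non-numeric fields instead of guessing a value.
        if len(digits) != last - first + 1 or not digits.isdigit():
            raise ValueError(f"bad {name} field {digits!r} in {expansion_id!r}")
        values.append(int(digits))
    return values


if __name__ == "__main__":
    # Variable 1 = soil, variable 2 = vegetation, variable 3 = slope.
    print(split_identifier("000013004207"))
```

Once split, a variable is addressed by its position in the list
rather than by its digit range, which is the point of the
restructuring.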

12. Add Variable (ADDVARBL) - ADD VARIABLE allows variables to be
added to a file which has been expanded by the TERRAIN UNIT EXPANSION
program and subsequently restructured by the SPLIT IDENTIFIER program.

After the expansion and restructuring process, a polygon has a
number of characteristics associated with it (i.e., a type of soil,
a type of vegetation, etc.) which can be identified as variable one,
variable two, etc. Sometimes, one (or more) of these variables
represent variations which in turn encompass composite characteristics.
For instance, instead of polygons being represented by different
soil types, polygons may be represented by different soil series.
Each soil series classification may be typified by a specific ph
factor, a specific erodibility factor, etc. After the SPLIT IDENTI-
FIER program, all that would be known per polygon would be the soil
series category. For modeling purposes, the individual components
of the soil series may be necessary. In such cases, the soil series
numeric identifier can be used as a "key" to the specific characteris-
tics associated with that particular soil series code. These
associated characteristics can then be added to the SPLIT IDENTIFIER
FILE, just as terrain unit components were added to the terrain unit
file using the TERRAIN UNIT EXPANSION program.

It should be mentioned that the ADD VARIABLE program can be used to expand
a Multi-Variable Grid File.

While the example uses a soil series code per polygon as a
"key" to additional data which could be added to the SPLIT IDENTI-
FIER file, several variables could be used as a composite key to add
additional data. This could be likened to a form of modeling (i.e.,
if the soil series key variable code and the geology key variable
code take a given pair of values, add the associated erodibility
factor as a new variable).

The characteristics associated with the sub-categories of each
variable which is acting as a "key" are known (i.e., soil series
0013 - ph factor and erodibility factor E, soil series 0014 - ph
factor 0 and erodibility factor K, etc.). This variability was
pre-determined in order to establish a coding sequence to describe
polygon variation. This table of possibilities may be added to
the file by the ADD VARIABLE program. The only unknown is the
code(s) of the key variable(s) per polygon.

The ADD VARIABLE program examines the SPLIT IDENTIFIER file
polygon by polygon. It identifies the code(s) associated with the
"key" variable(s) and links this code, or specific set of codes, to
the appropriate characteristic or set of characteristics input by
the user. The characteristics to be associated with that code, or
set of codes, are then added, as additional variables, to the
variables which already define the polygon being examined. The
SPLIT IDENTIFIER file is thus expanded.

A printout may be specified which lists, polygon by polygon,
the old versus the new variable data.
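
The key lookup performed by ADD VARIABLE can be sketched as a table
join. The sketch below is an illustration under stated assumptions,
not the program's actual logic: the function name add_variables, the
numeric codes, and the factor values in the table are all
hypothetical. A key may be a single variable or a composite of
several.

```python
def add_variables(polygons, key_positions, table):
    """Append the characteristics keyed by one or more variables.

    polygons      -- list of variable lists, one list per polygon
    key_positions -- positions (counted from 1) of the key variable(s)
    table         -- maps a tuple of key codes to the values to add
    """
    expanded = []
    for variables in polygons:
        key = tuple(variables[pos - 1] for pos in key_positions)
        if key not in table:
            raise KeyError(f"no characteristics listed for key {key}")
        # Old variables are kept; new ones are added at the end.
        expanded.append(list(variables) + list(table[key]))
    return expanded


if __name__ == "__main__":
    # Hypothetical table: soil series code -> (ph class, erodibility class).
    soil_series = {(13,): (5, 2), (14,): (6, 3)}
    polygons = [[42, 13], [7, 14]]  # variable 2 is the soil series key
    for old, new in zip(polygons, add_variables(polygons, [2], soil_series)):
        print(old, "->", new)  # old versus new variable data
```

Using a tuple as the table key lets the same lookup serve a
composite key, for example key_positions [2, 3] for a soil series
code combined with a geology code.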

2.13 POLYMODEL

POLYMODEL, as the name implies, allows polygon modeling. That is,
the geographic variations (i.e., type of soil, type of vegetation, etc.),
characteristic of a polygon within a study area can be examined and
compared to the same features of other polygons within the study area.
This capability can be used to develop polygon rankings which answer
planning questions, such as "what areas within the study area are most
capable or most suitable for a particular use". POLYMODEL is used in
conjunction with a single terrain unit file which has been restructured by
the SPLIT IDENTIFIER (RECORD) utility program. The new structure identifies
variables individually instead of as a portion of the composite terrain
unit file. With POLYMODEL the user has the option to identify any of the
variables associated with each terrain unit polygon and manipulate them.

As mentioned, the manipulations actually represent user defined models
necessary to satisfy unique analysis needs (such as identifying soil
erosion potential), and may encompass a sophisticated interaction of many
variables which include additive and/or multiplicative weightings, etc.
When several variables are involved, the appropriate variables associated
with each polygon are separately tested on a polygon by polygon basis. As
these appropriate variables are tested, weighted values are determined
depending on the user specified test criteria. A cumulative weighted
total (a ranking) evolves for each polygon and a file is generated.

The composite polygons which form the study area may then be mapped as
the model, with numeric indicators (using POLYPLOT) or lighter or darker
shadings (using AUTOMAP), representing the constraints or opportunities
of a landscape.

While such modeling capabilities normally require special programs
to be written by computer programmers, POLYMODEL is a type of computer
language which was specifically created for use by non-technical users
and actually allows the users to "write" (create) these special programs.
No special education, materials, etc., are necessary other than the
PIOS system and this documentation of POLYMODEL.

To provide the above mentioned modeling capabilities, POLYMODEL
incorporates first, a control card which describes the terrain unit file
being input and, second, a number of specific purpose commands. The
program is flexible and the user is able to specify variations of the
POLYMODEL commands and their components. Such variations allow the user
to control the analyses desired.

Regarding the POLYMODEL commands, each command and its associated
data is input on a single card. All cards have the same format; while the
various format fields per card allow different data inputs, for simplicity
the same data input structure applies to each command card. The structure
incorporates: four data identification fields; three data manipulation
fields; and one text field. The inputs per card vary according to the
command being input on that card in that different fields will be used for
different commands.

3 GRIPS  Software 


The main purpose of the GRIPS system is to convert polygon oriented data
into a file form usable by the programs in the GRID System. The input layout
is similar to programs in the AUTOMAP II family; however, there are several
significant differences. The most significant difference is that, in
GRIPS, the polygon data values are input at the same time (and normally from
the same file) as the coordinates. The GRIPS system consists of three
programs. The first program is used to create an unsorted Base Map Image
File (BMIF). The second program is the same file sorting program used in
AUTOMAP II. It reads the unsorted BMIF and outputs a sorted BMIF. The
third program is a file conversion program. This program reads the sorted
BMIF, changes the file structure, and outputs a single variable file for
use by the GRID system. However, the last two programs require no user
input and are completely controlled by system control cards.

4 GRID Software

The GRID computer graphics information system is a general purpose
computerized planning tool for the collection, storage, manipulation,
analysis, and presentation of geographically arranged data. It consists
of several programs which produce many unique analysis and display functions
useful for planners, designers, and engineers.

The programs are specifically designed for use by persons with little
actual programming experience. For a given study, many separate analysis
and display functions may be warranted. The programs are designed to be
compatible and efficient, with preliminary encoding and file creation steps
one time procedures not needing duplication each time a variable is to be
used. Each variable can be retained as a single variable data file to be
used along with other similar single variable data files, or several such
files can be merged to form a multi-variable data file. Either way, an
entire data bank of environmental information is available for the
various required analyses.

While the interaction of the component programs is essential, equally
important is the way information is stored for retrieval and eventual use
by the programs. Since the variables to be stored, such as topography or
soil type, are geographically oriented, and computer storage is sequentially
(or at least numerically) oriented, a system must exist which relates the
two. Thus, a locational referencing system is necessary. If a large
geographic area were subdivided into many small sub-areas, each sub-area
could be more definitively identified than the original area, because
less variation would be likely to occur. Further reduction in size of a
sub-area would further increase the possibility of unique definition. At
some point, say one acre, a specific data category, such as "soil type"
within the overall variable "soil", could be identified as that soil type
most representative of the smaller areal unit regarding the variable soil.

If many of these areal units were placed side by side both horizontally
and vertically, each similarly representing, say, a discrete soil type,
a large area could be represented regarding the variability of soils.
This system of contiguous areal units to represent a variable, such as
soil type, is in fact, the basis of the "GRID" information system. In
fact, "GRID" takes its name from this concept.

A series of equally spaced, parallel lines, both horizontal and verti-
cal, is overlaid on a specific area, dividing it into a grid of equal
sized squares or cells, as shown in Figure 47. By identifying each grid
cell (or areal unit) with a specific sub-category of a geographic data
variable, the entire grid then abstractly represents the collective
distribution of any data variable over an area identified for study.

The use of such a grid cell system for data identification is the
significant factor linking geographic information to machine storage
regarding the GRID system. Additional variables, such as slope, vegetation,
etc., can be similarly identified for the same area for a composite
description. This composite description can then be stored as a group
of single variable data files or can be merged into a multi-variable data
file.

Since the grid is a set of continuous equal sized cells, orderly and
systematically structured due to horizontal and vertical lines, the cells
which are horizontally side by side can be thought of as rows, and the
cells which are vertically side by side can be thought of as columns.
Every cell in the grid is then a part of some row and some column. Each
row/column location can be matched with the data variable, such as soil
type, associated with that location.

Since the grid used for identifying geographic information is
actually a matrix, it can be transformed, in total, into a numeric
representation as described on the preceding page. The information, once
transformed, can then be sequentially fed to the computer row by row,
as a digital matrix suitable for computer manipulations. Any number of
variables for a chosen area of study can be thus stored on a computer for
eventual recall for analysis and/or graphic representation of a large
overall information system.
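
The row by row storage of a grid can be sketched as follows. This is
an illustration only: the function names flatten_grid and cell_value
are assumptions, and rows and columns are counted from 1 simply to
match the row/column wording of the text.

```python
def flatten_grid(grid):
    """Return (rows, columns, values) with the cells stored row by row."""
    n_rows = len(grid)
    n_cols = len(grid[0]) if n_rows else 0
    values = []
    for row in grid:
        if len(row) != n_cols:
            raise ValueError("all rows must have the same number of cells")
        values.extend(row)
    return n_rows, n_cols, values


def cell_value(values, n_cols, row, col):
    """Look up one cell by row and column in the sequential values."""
    return values[(row - 1) * n_cols + (col - 1)]


if __name__ == "__main__":
    soils = [
        [1, 1, 2],
        [3, 2, 2],
    ]
    n_rows, n_cols, values = flatten_grid(soils)
    print(values)                            # cells in storage order
    print(cell_value(values, n_cols, 2, 1))  # soil code at row 2, column 1
```

Because every row has the same number of cells, the position of any
cell in the sequence follows from its row and column alone, so no
coordinates need to be stored with the data.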

The reader should keep in mind that, as previously mentioned, associating
row/column locations with geographic data may be accomplished manually or
by machine, such as by using GRIPS to convert PIOS data into a grid cell
format. The latter method is most efficient and is normally used by ESRI. In
addition, besides being efficient, the PIOS/GRIPS/GRID sequence provides
data usable with both systems.


Figure  47 

4.1 SINGLE VARIABLE FILE GENERATION - FILGEN PROGRAM/RECGEN PROGRAM

When geographic data variables are manually coded and keypunched,
versus using PIOS and GRIPS, the next step is to generate a computerized
data file. This file generation is accomplished by one of two computer
programs: FILGEN or RECGEN. FILGEN is used to create a file from data
encoded using a cell by cell coding convention (often used when great
variation exists for the data). When a significant amount of data is




programs alone; they must be used in conjunction with other programs
with manipulative capabilities for specific tasks. Once one or
more single variable data files have been generated with these programs,
they are stored within the computer system (as a number of single
variable data files, or through merging, as one or more multi-variable
data files) for recall and use by any of the various manipulative
programs (GRID II, SEARCH, etc.). Because of the storage aspect of
such file generations, a FILGEN or RECGEN program may, or may not, be
submitted simultaneously with a run whose purpose is data manipulation.

4.2 GRID II PROGRAM

"GRID II" is a grid cell oriented computer mapping program which
provides map graphics of spatial information. It is an execution
oriented program to be used in conjunction with stored single variable
or multi-variable file data.

GRID II has been designed for speed and efficiency of map pro-
duction. It normally uses the computer's standard printer because of
the speed and cost advantages over plotter type maps. GRID II, with
its high speed map production, allows the user instant feedback for
environmental simulation modeling, a decided asset in environmental
planning.

The GRID II program instructs the computer to make a map or maps,
and specifies the precise form of the map or maps in terms of certain
available elective treatments. The program can produce a set of maps,
but each map requires its own set of instructions.

4.3 GRID MODEL

GRID MODEL allows the user to manipulate one or more variables prior
to mapping. The manipulations actually represent user defined models
necessary to satisfy unique analysis needs (such as identifying soil
erosion potential), and may encompass a sophisticated interaction of
many variables which include additive and/or multiplicative weightings,
etc. When several variables are involved, the appropriate variables
associated with each cell are separately tested. As these appropriate
variables are tested, weighted values are determined depending on the
user specified test criteria. A cumulative weighted total evolves for
each cell. The composite of cells which form the study area may then
be mapped as the desired model.

While such modeling capabilities normally require special programs
to be written by computer programmers, GRID MODEL, like POLYMODEL, is
a type of computer language which was specifically created for use by
non-technical users and actually allows the users to "write" (create)
these special programs. No special education, materials, etc., are
necessary other than the GRID II program and its documentation.

To provide for the above mentioned mapping and modeling capabilities,
GRID MODEL incorporates a number of specific purpose commands. The
program  is  flexible  and  the  user  is  able  to  specify  variations  of  the 
GRID  MODEL  commands  and  their  components.  Such  variations  allow  the 
user  to  control  the  mapping  or  analyses  desired. 

Each command and its associated data is input on a single card.
All cards have the same format; while the various format fields per
card allow different data inputs, for simplicity the same data input
structure applies to each command card. The structure incorporates:
four  data  identification  fields;  three  data  manipulation  fields;  and 
one  text  field.  The  inputs  per  card  vary  according  to  the  command 
being  input  on  that  card  in  that  different  fields  will  be  used  for 
different  commands. 

GRID  MODEL  modeling  capabilities  vary  from  simple  to  complex  and 
are  nearly  infinite.  A simple  model  might  examine  a variable  to 
identify  a particular  characteristic.  For  instance,  a user  might 
wish  to  examine  a "soils"  variable  located  in  the  9th  position  on  a 
multi-variable  file,  to  identify  the  locations  of  one  particular 
type of soil, such as "shallow granitic". If the variable was
composed of five types of soil, identified numerically by the numbers
1-5 ("2" representing shallow granitic soil), four IF tests would be
able  to  complete  the  analysis,  as  follows: 

IF the "soils" cell being examined contains the value "0"
(i.e., non-study area), input -9999 in the Add field as a
"turnoff".

IF the "soils" cell being examined contains the value "1", input
-9999 in the Add field as a "turnoff".

IF the "soils" cell being examined contains the value "2", add
"1".

IF the "soils" cell being examined contains the integers 3 through
5, input -9999 in the Add field as a "turnoff".

The above logic would identify the shallow granitic soils and pass
a "1" to the output file. Other soil types (and original non-study
area cells) would all be "turned off" and passed as non-study area
cells.  Subsequent  mapping  would  graphically  display  only  the  location 
of  shallow  granitic  soil. 
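
The four IF tests can be expressed as a short sketch. It is not GRID
MODEL's command language, whose card layout is not reproduced here.
The function name and the treatment of a "turnoff" (any cell whose
cumulative total goes negative is passed as non-study area, value 0)
are assumptions made for illustration.

```python
TURNOFF = -9999  # value entered in the Add field to turn a cell off
NON_STUDY = 0    # value that marks a non-study area cell


def shallow_granitic_model(soils):
    """Apply the four IF tests to every cell of a soils grid."""
    output = []
    for row in soils:
        out_row = []
        for value in row:
            total = 0
            if value == 0:        # test 1: non-study area
                total += TURNOFF
            if value == 1:        # test 2: soil type 1
                total += TURNOFF
            if value == 2:        # test 3: shallow granitic soil
                total += 1
            if 3 <= value <= 5:   # test 4: soil types 3 through 5
                total += TURNOFF
            # Assumption: a negative total means the cell was turned off.
            out_row.append(NON_STUDY if total < 0 else total)
        output.append(out_row)
    return output


if __name__ == "__main__":
    soils = [
        [0, 1, 2],
        [3, 2, 5],
    ]
    for row in shallow_granitic_model(soils):
        print(row)
```

A large negative turnoff value is convenient in a cumulative scheme
because no combination of ordinary positive weights can bring the
total back above zero.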

Of  course,  simply  mapping  the  entire  "soils"  variable  could 
graphically identify shallow granitic soil, but the other soil types
being present (or being placed in all inclusive high and low level
categories) would detract from the distinct separation possible if only
shallow granitic soil was mapped. While the user can see the many
possible  ways  such  a need  could  be  satisfied,  if  the  "four  IF  test" 
approach  was  used,  the  GRID  MODEL  submission  might  look  as  follows: 


[GRID MODEL command card listing not recoverable from the source]


4.4 GRID SYSTEM - MULTI-VARIABLE FILE GENERATION - GRID MERGE PROGRAM

The GRIDMERGE program is used to merge several single variable
data files simultaneously or separately into a new or existing multi-
variable data file. Normally, additional variables are added to an
existing multi-variable file called an "old" or "base" file. However,
replacement of a variable may also take place. While replacement of any
variable is possible, sporadic data cannot be inserted to provide update
capabilities. A variable, or variables, to be added or merged must
represent the same area (or data file). The user should note that a
multi-variable  data  file  is  a storage  concept,  as  opposed  to  simply 
a descriptive  "name."  Several  variables  are  not  necessary  for  a 
multi-variable  file  to  function.  One  variable  may  be  input,  stored, 
and  used,  and  other  variables  may  then  be  added  as  desired. 

The multi-variable file created by the GRIDMERGE system is an
overlay concept process, similar to the manual overlay of plastic maps
which can be used to describe variations in geography (i.e., soils,
vegetation, etc.). The plastic overlay maps, however, are replaced
by  computerized  digital  grid  (matrix)  overlays.  An  example  of  this 
concept  was  shown  previously  as  figure  38. 

Grids  generated  by  other  programs  could  be  merged  and  stored  here, 
such as by SEARCH or GRID MODEL. Since each data variable grid would
represent a designated layer, each variable can be accessed by specifying
the appropriate layer. Specifying the row and column, in addition to
the  variable  layer,  can  locate  a specific  grid  cell  within  the  designated 
layer. 
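
The overlay idea can be sketched as a list of equally sized grids,
one per variable layer. The sketch is illustrative only: the names
merge_layers and get_cell are assumptions, not GRIDMERGE routines,
and layers, rows, and columns are counted from 1 to match the text.

```python
def _shape(layer):
    """Return (rows, columns) of one layer, checking it is rectangular."""
    n_cols = len(layer[0]) if layer else 0
    if any(len(row) != n_cols for row in layer):
        raise ValueError("grid rows differ in length")
    return len(layer), n_cols


def merge_layers(base_layers, new_layers):
    """Return a new multi-variable file: the base layers plus the new ones."""
    merged = [[row[:] for row in layer] for layer in base_layers]
    for layer in new_layers:
        # A merged variable must represent the same area as the base file.
        if merged and _shape(layer) != _shape(merged[0]):
            raise ValueError("layer does not cover the same grid as the base file")
        merged.append([row[:] for row in layer])
    return merged


def get_cell(multi, layer, row, col):
    """Locate one grid cell by variable layer, row, and column."""
    return multi[layer - 1][row - 1][col - 1]


if __name__ == "__main__":
    soils = [[1, 2], [2, 2]]
    slope = [[3, 3], [1, 4]]
    multi = merge_layers([soils], [slope])
    print(get_cell(multi, 2, 2, 1))  # slope value at row 2, column 1
```

Starting from a single layer and adding others later mirrors the
point made above that a multi-variable file does not need several
variables in order to function.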




4.5 GRID SYSTEM - WINDOW PROGRAM

The WINDOW program is used to identify, for analysis, a sub-area
within a larger study area for which a multi-variable file has been
created and already exists. This capability is also available in
RECGEN. The multi-variable file may be a product of either the
GRIDMERGE or MAPMERGE program. An example of a simple windowed area
is shown below.

[Figure: a "WINDOW" STUDY AREA outlined within the EXISTING STUDY AREA]


Depending on the desired analysis, there are two possible outputs
associated with the WINDOW program, as follows:

1. A printout of the data values stored for the windowed area may
be  specified. 

2.  A multi-variable  file  of  the  windowed  area  may  be  specified  to 
be  created  and  stored. 
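
Windowing can be sketched as cutting the same row and column range
out of every variable layer. The function name window and its
parameters are assumptions for illustration; rows and columns are
counted from 1 and the bounds are inclusive.

```python
def window(multi, first_row, last_row, first_col, last_col):
    """Return a multi-variable file covering only the windowed sub-area."""
    n_rows = len(multi[0])
    n_cols = len(multi[0][0])
    if not (1 <= first_row <= last_row <= n_rows
            and 1 <= first_col <= last_col <= n_cols):
        raise ValueError("window must lie inside the existing study area")
    return [
        [row[first_col - 1:last_col] for row in layer[first_row - 1:last_row]]
        for layer in multi
    ]


if __name__ == "__main__":
    multi = [
        [[1, 2, 3], [4, 5, 6], [7, 8, 9]],  # variable 1
        [[0, 0, 1], [0, 1, 1], [1, 1, 1]],  # variable 2
    ]
    sub = window(multi, 2, 3, 2, 3)
    for layer in sub:   # output 1: a printout of the windowed values
        print(layer)
    # Output 2 would store `sub` as a new multi-variable file.
```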

4.6 GRID SYSTEM - CELL UPDATE PROGRAM


Cell Update is a program which allows the user to alter data in a
previously created multi-variable file and output a revised multi-
variable file. The alterations may involve changing a file's referencing
system to coincide with an alternate system, such as the State Plane
system. This is for display purposes only. The alterations may also
involve changing the data file on a cell-by-cell basis, with one or more
variables being changed at a time, or all the variables for a particular
cell being deleted. This capability is also available in RECGEN. There
are two reasons to alter the data variable(s) for any given cell. First,
a user may wish to update data to reflect a current status after a
change has occurred, such as in land use information. Second, a user
may wish to correct an error, such as an incorrect data value for a
variable due to misinterpretation, clerical error, etc.

4.7 GRID SYSTEM - SEARCH PROGRAM


SEARCH is a grid cell oriented computer program which provides data
manipulation capabilities for geographically disposed information. It
is a file generation program to be used in conjunction with stored single
and multi-variable data files. The file generated is stored within the
computer system. If manipulation and mapping of the generated file is
desired, a new map package run must be initiated to allow the file to
be used and mapped (printed) in conjunction with the GRID program. The
SEARCH program provides three basic capabilities for the analysis of
geographic data stored on a computerized GRID system:


1. Minimum Distance - The SEARCH program analyzes an area surrounding
each cell (the radius around the cell is specified by the user)
in the grid of geographically disposed data. It then identifies
the shortest distance from the cell in question to any other cell
within the specified area which represents a pre-determined
value or values, called a "match" value. As mentioned, the
radius of the search is specified by the user and the minimum
distances to the cells searched for must be within this radius.
If no searched-for values are found within the radius specified,
a special "beyond scan" value is assigned to the cell being
tested. The minimum distances thus calculated, or the default
values, then become values associated with the respective grid
cells which each search centers around. The values are then
stored as a single variable file (can be merged into a multi-
variable file). By defining value range parameters associated
with desired levels, the minimum distance variable can be mapped
for user analysis.


2. Number of Occurrences Unweighted - The SEARCH program can also
be used to compute the number of times such a pre-determined
"match value" occurs within the distance specified. In this
instance, the number of occurrences relating to each cell in the
grid are stored as a single variable file (can be merged into a
multi-variable file) for user analysis. It should be noted
that cells along the outer perimeter of the study area will have
a reduced search area. This is because cells outside the study
area will not be searched, whether within the search radius of
a given cell within the study area or not. The reduced search
area also applies to cells whose distance to the outer perimeter is
less than the specified search radius. This reduced search area
phenomenon along the edges of particular study areas is an
important concept. As an example, if an archaeology site grid
were used, the number of occurrences of archaeologic sites relating
to each grid cell might be used in conjunction with other
geographic variables, such as slope and/or soils, for locational
analysis regarding residential uses. Such location analyses
might be conducted for an area whose boundary was coincidental
with a political boundary, such as a county. If the respective
county was conducting the analyses, archaeologic sites outside
the county would probably not be pertinent to the effort, and
therefore would not need to be searched. In this study, it
would only be necessary to know the number of occurrences of
archaeologic sites within the study area. Such may not always
be the case.




3. Number of Occurrences Weighted - In certain cases, the number
of occurrences of a given searched for value may need to include
occurrences outside a specific study area. An example may be
water sources relative to a possible residential development. A
number of occurrences search, with weighting, provides this capability.
In such cases, cells outside the study area will not be searched,
but occurrences outside the study area are approximated through
such weighting. For the given study area, the process involves
identifying for a specified search radius, a number of cells
which will normally be searched. There will be a number of
"match value" occurrences within this number of cells. The
specified area around each cell in the grid matrix is searched
in this manner. When a cell near or along the perimeter of the
study area is searched, a number of cells less than normal is
searched. The reduced number of searched cells is compared to
the normal number of searched cells and a coefficient is produced.
This coefficient is applied to the number of match values
resulting from the reduced search area. This produces a number
of occurrences approximation which might have occurred, had a
normal search taken place, where weighting near the edge of the
study area was not required.
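
The three SEARCH capabilities can be sketched together. This is a
reading of the description above under several assumptions, not the
SEARCH program's actual method: distances are straight-line
distances between cell centres in cell units, the cell being tested
is not counted, the edge of the grid stands in for the study area
boundary, the weighting coefficient is taken as the normal number of
searched cells divided by the reduced number, and BEYOND_SCAN is a
hypothetical stand-in for the "beyond scan" value.

```python
import math

BEYOND_SCAN = -1.0  # hypothetical stand-in for the "beyond scan" value


def search(grid, match_values, radius):
    """Return (minimum distance, occurrences, weighted occurrences) grids."""
    n_rows, n_cols = len(grid), len(grid[0])
    reach = int(radius)
    # Offsets of every other cell whose centre lies within the radius.
    offsets = [(dr, dc)
               for dr in range(-reach, reach + 1)
               for dc in range(-reach, reach + 1)
               if (dr, dc) != (0, 0) and math.hypot(dr, dc) <= radius]
    normal = len(offsets)  # cells searched when no edge interferes

    min_dist, counts, weighted = [], [], []
    for r in range(n_rows):
        d_row, c_row, w_row = [], [], []
        for c in range(n_cols):
            searched, found, best = 0, 0, None
            for dr, dc in offsets:
                rr, cc = r + dr, c + dc
                if not (0 <= rr < n_rows and 0 <= cc < n_cols):
                    continue  # cells outside the study area are not searched
                searched += 1
                if grid[rr][cc] in match_values:
                    found += 1
                    dist = math.hypot(dr, dc)
                    if best is None or dist < best:
                        best = dist
            d_row.append(BEYOND_SCAN if best is None else best)
            c_row.append(found)
            # Scale the count up by normal / reduced number of cells.
            w_row.append(found * normal / searched if searched else 0.0)
        min_dist.append(d_row)
        counts.append(c_row)
        weighted.append(w_row)
    return min_dist, counts, weighted


if __name__ == "__main__":
    sites = [
        [0, 0, 1],
        [0, 0, 0],
        [0, 0, 0],
    ]
    distances, occurrences, approximated = search(sites, {1}, radius=1)
    print(distances[0][1], occurrences[0][1], approximated[0][1])
```

For an interior cell the coefficient is 1, so the weighted and
unweighted counts agree; only cells whose search area is cut off by
the edge are adjusted.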

4.8 GRID SYSTEM - GRIDPLOT

GRIDPLOT is a computer program which provides graphics output.
GRIDPLOT graphics differs from GRID printer graphics in that, instead of a
character printer being used to provide output display capabilities, a
pen plotter is used to draw line representations of data. Grey tone
drawings (shadings) represent the data, similar to the grey toning
supplied by printer character overprinting. However, the plotter grey
tones which represent the various levels present much sharper shading
contrasts regarding area representations. Part of this definitive area
separation is due to the optional capability to outline the described
areas.

GRIDPLOT is an output program which is used in close conjunction
with the GRID program. The computerized cross-referencing system of
GRIDPLOT which controls the pen plotter was developed specifically for
use with the GRID based cross referencing system.

4.9 GRID SYSTEM - ELECTROSTATIC GRIDPLOT

ELECTROSTATIC GRIDPLOT is a computer program which provides graphics
output. It is used with data that has been automated according to the
GRID cross-referencing system. The ELECTROSTATIC GRIDPLOT graphics
differs from GRID and GRIDPLOT graphics in that, instead of a character
printer and a pen plotter respectively being used to provide output
display capabilities, an electrostatic plotter is used to output tonal
representations of data. The tonal representations of the data are
similar to the grey toning supplied by printer character overprinting
and pen plotter drawings.

As previously mentioned, plotter representations present much
sharper shading contrasts regarding displayed areas. Part of this
definitive area separation is due to the optional capability to outline
the described areas. While this is true for both plotter techniques,
electrostatic plotters are much faster and more cost effective. This is
because an electrostatic plotter allows plotter type visual reproductions
spontaneously on a line by line basis (similar to a line printer) as
opposed to the historically slow pen plotter delineations.

4.10  COLORMAP 

COLORMAP is a computer program which provides color graphics output
of GRID and CRIPS generated data files. COLORMAP graphics differs from
GRID printer and plotter graphics in that, instead of a character
printer or plotter being used to provide output display capabilities,
a DICOMED digital image recorder is used to generate color graphic
representations of grid cell oriented data. Color hues and tones
represent areas of homogeneous data, similar to the grey toning supplied
by printer character overprinting or plotter drawings. However, because
of color, the image recorder representation presents much sharper
contrasts regarding area representations. This definitive area
separation can be enhanced by the optional capability to outline the
described areas.

4.11  AREA  CALCULATIONS 

The AREA CALCULATION program is an output oriented program of the
GRID system. The main purpose of the program is to quantify, in areal
units (i.e., acres, square feet, square meters, square miles, etc.) the
sub-categories of one variable which, when another variable is
superimposed on it, fall within the sub-category boundary definitions of the
other variable.

For instance, the AREA CALCULATION program would allow a user to
summarize the various land uses by acres, square miles, etc., which occur
within census tract boundaries, planning district boundaries, etc., or
different soil types as they occur within floodplains, agricultural
preserves, planning districts, etc. If land uses were to be summarized
relative to census tract boundaries, all census tracts in the variable
(or specific census tracts selected by the user) would be examined, one
at a time, and the land uses within each tract's boundary would be identified.
The identified land uses for each census tract would then be aggregated
by land use category and summarized in the appropriate user specified
areal units.
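
A minimal sketch of this kind of two-variable area summary is given below. It is an illustration only, not the AREA CALCULATION program itself: the function name, the list-of-lists grid layout, and the single `cell_area` conversion factor are assumptions introduced for the example.

```python
from collections import defaultdict


def area_summary(zones, categories, cell_area, selected_zones=None):
    """Total area of each category (e.g. land use) inside each zone
    (e.g. census tract), for two grids covering the same cells.

    cell_area is the area of one grid cell in the user's chosen areal
    unit. selected_zones optionally restricts the summary to some zones.
    All names here are hypothetical.
    """
    totals = defaultdict(lambda: defaultdict(float))
    for zone_row, category_row in zip(zones, categories):
        for zone, category in zip(zone_row, category_row):
            if selected_zones is not None and zone not in selected_zones:
                continue  # user asked for specific zones only
            totals[zone][category] += cell_area
    return {zone: dict(cats) for zone, cats in totals.items()}


# Example: two census tracts, land use per cell, 2.5 acres per cell.
tracts = [[1, 1, 2],
          [1, 2, 2]]
land_use = [["res", "ag", "ag"],
            ["res", "res", "ag"]]
summary = area_summary(tracts, land_use, cell_area=2.5)
```

In this example tract 1 holds 5.0 acres of "res" and 2.5 acres of "ag", and tract 2 holds the reverse. Changing `cell_area` is all that is needed to report in a different areal unit.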




VARIABLE  GRID  RESOLUTION 
HYDROLOGIC  ENGINEERING  CENTER 

DISCUSSION 


Q. As a side issue, can you relate how the terrain unit analysis approach
your organization uses in many of its studies compares to the triangle
approach of W.E. Gates and Associates?

A. In both the triangle approach and the integrated terrain unit
approach there is an attempt to classify variation of geography
according to its lowest common denominator within some generalized
classification system (i.e., soils breakdown, slope classes, etc.).
The similarity essentially stops at this conceptual level. It is
important to discuss the differences within this context. First,
the integrated terrain unit approach envisions the integration of
natural factors, separating out for another type of display cultural
features, including infrastructure and coverages such as land use. The
philosophical basis for terrain unit mapping is that landscape can be
classified into relatively homogeneous morphological units which have
varying degrees of predictability. This classification is not necessarily
influenced by cultural factors, although quite by chance it often occurs
that cultural factors fall into some of these patternings (i.e., ownership,
linear cultural developments, land use, etc.). Second, terrain units
attempt to integrate the various classified factors along common boundaries
which are highly irregular in shape. This is markedly different from the
triangle unit, which is at best a gross abstraction of morphological
boundaries.

It's important to focus on the uses of triangles vs. the uses of
polygonal terrain units. In the case of the triangle structure, the
basic unit application developed by W.E. Gates and Associates was for
manipulating information in an aggregated format such that area data
originating from areal unit definitions could be handled at a watershed
or sub-watershed unit scale. The terrain unit analysis is primarily
structured for mapping as well as overlay and area summaries
within units of larger size. Graphics are therefore a very important
component of the terrain unit analysis, and this stress on accuracy
of the boundary definitions is important. It is particularly relevant
when one considers that these maps are often used to define
lines for environmental management which actually are used in the
field. My understanding of the triangles is that, although maps are
made from them, these maps are not meant for management of the environment
on the ground but rather for the presentation of abstractions to be
used in general analysis or communication of such patternings of
geography. Finally, the terrain unit analysis focuses on the
classification of qualitative information and does not attempt
to consider the third dimension (i.e., topographic elevation).


Q. Can you compare or rank the grid cell, triangles, and terrain unit
approach as to their desirability for the types of analysis the HEC
does in its studies?

A. It is almost impossible to rank grid cell vs. triangles vs. terrain
units as approaches which could be weighted according to desirability
by HEC. They are clearly different approaches and in one sense
are designed to handle different things. Triangles are a form of
abstraction. They work very well for handling topographic data and
are much more generalized for handling polygonal data such as vegetation,
soil, geologic structures, etc. The grid cell, like triangles,
can handle both elevation and polygonal information. However,
it defines a three-dimensional surface less accurately, and perhaps
what one might call differently, than the triangular approach. It is
not as useful when considering a variable sized grid or unit of aggregation.
On the other hand, it is regular, and if we're considering
certain types of polygonal data it has many advantages. The terrain
unit is qualitatively a different type of unit for representation of
geographic data. It does not handle three-dimensional information, yet
it handles the qualitative information (soils, geology, vegetation, etc.)
in a way which is better than parametric mapping because it resolves
many of the boundary variations resulting from inaccuracies in original
map data. For the polygonal maps, information is represented more
accurately (i.e., less abstractly) than with a triangle or grid cell. For
graphics it is superior. Triangles, on the other hand, because of their
multiple use capabilities, offer some unique possibilities, particularly
in the terrain areas.


Q. Your organization has worked with both multivariable grid file and
polygon overlay systems. Since none of the seminar participants has
suggested a total polygon overlay system, can you explain why ESRI
did not make it a recommendation?

A. Polygon overlay systems are not a feasible alternative at this time
due to the cost associated with these systems. However, new programs
are under development which may result in more efficient solutions
to the polygon overlay problem. A second aspect of polygon overlay,
which does not deal with cost, is the complexity of the resulting maps
when polygons are overlaid. Resolving the problems which
result from the overlay of lines which in fact are the same line
represented in two separate variables is a major technical problem. The
essence of this problem is that when these polygons are overlaid the
result may be the creation of many very small polygon structures which
are not meaningful.




The run-length storage of multivariable files seems to be desirable
as far as reducing required storage device size and the costs associated
with large disc files. However, there currently is no communication
module available to have existing programs access or create
this new file structure.

It is true that considerable savings would occur both in processing
and in storage if the multivariable run-length files could be
created. If uses are to be made of this new file structure, existing
programs will have to be adjusted to utilize the data. The
extent of these adjustments is not known at this time; however, it is
expected that, rather than restructuring all the programs, an interface
module would be developed which would convert files from one format to
the other. The implication of this is that there is a strong theoretical
basis for this type of system, and cost savings, particularly in
large amounts of processing, would no doubt result in a major benefit
many times greater than the initial investment in conversion of the
program. More specifically, much of the cost associated with existing
programs is in reading each cell. Savings should be made by reading
only where there is a change in the data variables being analyzed
and then looping through the calculations for however many cells are
the same.
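
The idea of reading only where the data change can be sketched as below. This is a hedged illustration of run-length storage in general, not the file structure proposed at the seminar: the function names, the tuple-per-cell layout, and the (values, count) run format are assumptions made for the example.

```python
def encode_runs(cells):
    """Collapse a row-ordered sequence of cells into runs.

    cells: one tuple of variable values per grid cell. A new run starts
    whenever any variable changes. Returns a list of (values, count).
    """
    runs = []
    for values in cells:
        if runs and runs[-1][0] == values:
            runs[-1][1] += 1          # same data as previous cell
        else:
            runs.append([values, 1])  # data changed: start a new run
    return [(values, count) for values, count in runs]


def tally_from_runs(runs, variable_index):
    """Cell counts per category for one variable, touching each run
    once rather than reading every cell."""
    counts = {}
    for values, count in runs:
        key = values[variable_index]
        counts[key] = counts.get(key, 0) + count
    return counts


# Six cells holding (soil type, slope class) collapse into three runs.
cells = [("A", 1), ("A", 1), ("A", 1), ("B", 1), ("B", 2), ("B", 2)]
runs = encode_runs(cells)
soil_counts = tally_from_runs(runs, 0)
```

Because a run ends as soon as any one variable changes, a variable that differs from cell to cell shortens every run in a multivariable file, which is why highly variable data reduce the benefit of this kind of scheme.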


VARIABLE  GRID  RESOLUTION 
HYDROLOGIC  ENGINEERING  CENTER 


AUTHOR'S  ADDITIONAL  COMMENTS 




1. In order to attain a perspective on digitizing techniques it is
proper to review two definitions related to the subject: a) maps and
the geographic features placed on them, and b) computer methods for
recording them.

There are four primary categories of map elements that are used to
represent geographic variation (i.e., points, lines, polygons and
surfaces).

In  their  map  form  these  elements  are  represented  relative  to  other 
elements  on  the  map  and  in  space  through  a system  of  measurements 
commonly  known  as  coordinates.  These  coordinates  are  of  varying  types 
and  possess  variable  accuracies  (i.e.,  from  roughly  sketched  maps  to 
highly  accurate  cartographic  displays  and  from  graphically  defined 
visual  relationship  coordinates  to  world  based  projection  systems  such 
as  latitude,  longitude,  UTM,  etc.). 

Most of the experience with geographic information systems suggests
that handling data in a "least common denominator" form is "the way to
go". This is justified because you can always aggregate smaller units
into larger units but you cannot disaggregate the minimum size unit.
Also, this concept of data is most beneficial to the maximum
number of applications because, of course, you can build a structure
which is most useful for a given application.
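
The one-way nature of aggregation can be illustrated with a small sketch. The majority rule used here, the function name, and the `factor` parameter are assumptions chosen for the example; other aggregation rules are equally possible.

```python
from collections import Counter


def aggregate_majority(grid, factor):
    """Aggregate a fine grid into coarser cells, each taking the most
    common value of the factor x factor block of fine cells it covers.
    Ties go to the value met first in the block (an arbitrary choice)."""
    n_rows, n_cols = len(grid), len(grid[0])
    coarse = []
    for r0 in range(0, n_rows, factor):
        coarse_row = []
        for c0 in range(0, n_cols, factor):
            block = [grid[r][c]
                     for r in range(r0, min(r0 + factor, n_rows))
                     for c in range(c0, min(c0 + factor, n_cols))]
            coarse_row.append(Counter(block).most_common(1)[0][0])
        coarse.append(coarse_row)
    return coarse


fine = [["a", "a", "b", "b"],
        ["a", "c", "b", "b"],
        ["d", "d", "e", "e"],
        ["d", "d", "e", "f"]]
coarse = aggregate_majority(fine, 2)
```

The coarse grid here is `[["a", "b"], ["d", "e"]]`. The single cells of "c" and "f" disappear, and nothing in the coarse grid allows them to be recovered, which is the sense in which the minimum size unit cannot be disaggregated.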

Cells, triangles and x,y coordinates expressed as polygons represent
alternative ways that geographic data can be abstracted for the computer.
All of these techniques are ways to spatially identify geographic
variations.

Grid and triangle data structures build in certain inflexibilities and
inaccuracies that may not be acceptable for certain applications.

From my point of view, x,y coordinates continue to offer the greatest
flexibility for conversion to all other data "cells". This is a very
important factor because the investment in data files by the Corps is
and will be great. The files should be in a form that can be used by
other agencies for other applications. To do this the data can later
be converted to triangles, grids, or any other form.




Specifically, one can build from x,y coordinates of points, lines
and polygons which represent surfaces (e.g., contours), any areal
unit or "cell"* which would be useful for a given application (i.e.,
grids, triangles or any other polygon shape).**
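
As one illustration of building grid cells from x,y coordinates, the sketch below assigns a polygon's value to every cell whose center falls inside the polygon. It is only a sketch under stated assumptions: the function names, the even-odd (ray casting) test, the cell-center rule, and the grid origin at (0, 0) are choices made for the example and are not taken from any program discussed in these proceedings.

```python
def point_in_polygon(x, y, polygon):
    """Even-odd test: cast a ray toward +x and count edge crossings.
    polygon is a list of (x, y) vertices; the last joins the first."""
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        if (y1 > y) != (y2 > y):  # edge spans the ray's y level
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def polygon_to_grid(polygon, value, n_rows, n_cols, cell_size, background=0):
    """Grid with origin (0, 0): a cell takes `value` when its center
    lies inside the polygon, otherwise `background`."""
    grid = []
    for r in range(n_rows):
        y = (r + 0.5) * cell_size
        row = []
        for c in range(n_cols):
            x = (c + 0.5) * cell_size
            row.append(value if point_in_polygon(x, y, polygon) else background)
        grid.append(row)
    return grid


square = [(1, 1), (3, 1), (3, 3), (1, 3)]
cells = polygon_to_grid(square, value=7, n_rows=4, n_cols=4, cell_size=1)
```

The same coordinate file can be gridded again at any other `cell_size`, whereas a file stored only as grid cells is fixed at the resolution at which it was encoded.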

The ultimate recording of data in the lowest common denominator is a
useful concept which should ultimately become the focus of HEC, USACOE
and institutions involved in this avenue of application. However, in
the short term this may not be practical for many reasons (i.e., existing
software and procedures, administrative and institutional arrangements,
hardware availability, manpower problems, etc.).

The system for encoding at HEC must remain one which is simple, consistent,
and workable in a field office unless there is a dramatic
reorganization of who does the data preparation later.

Several  short  term  recommendations  should  be  considered.  These  are 
outlined  below: 

(a) Compressed data structures using multiple compressed
run-encoding file structures should be incorporated.

(b)  Investigation  (in  pilot  studies)  of  the  feasibility  of  using 
triangles  for  both  conversion  of  topo  to  grid  as  well  as  a 
complete  conversion  from  grid  to  a triangular  format  for  data 
processing. 
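
For recommendation (b), one way a triangle could supply an elevation to a grid cell is linear interpolation across the triangle's plane. The sketch below uses barycentric weights for this. It is offered as an assumption about how such a conversion might work, not as the method of W.E. Gates and Associates or of any pilot study; the function name and data layout are hypothetical.

```python
def interpolate_in_triangle(x, y, triangle):
    """Elevation at (x, y) from a triangle of three (x, y, z) vertices,
    by linear (barycentric) interpolation. Returns None when the point
    lies outside the triangle. Assumes a non-degenerate triangle."""
    (x1, y1, z1), (x2, y2, z2), (x3, y3, z3) = triangle
    det = (y2 - y3) * (x1 - x3) + (x3 - x2) * (y1 - y3)
    w1 = ((y2 - y3) * (x - x3) + (x3 - x2) * (y - y3)) / det
    w2 = ((y3 - y1) * (x - x3) + (x1 - x3) * (y - y3)) / det
    w3 = 1.0 - w1 - w2
    if min(w1, w2, w3) < 0:
        return None  # point is outside this triangle
    return w1 * z1 + w2 * z2 + w3 * z3


# Vertices lie on the plane z = 100 + 10x + 20y.
tri = [(0, 0, 100), (4, 0, 140), (0, 4, 180)]
z_center = interpolate_in_triangle(1, 1, tri)
```

Evaluating this at each grid cell center, over whichever triangle contains that center, would yield a gridded elevation file; here the point (1, 1) receives an elevation of 130.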

In the longer term there is considerable justification for examination
of one or more centralized centers for data preparation in a format
which is consistent and in the lowest common denominator. This has
implications for hardware acquisition, staff organization and institutional
arrangements. As part of long-range expansion considerations
federal agencies (i.e., USGS, Census, SCS) should be stressed regardless
of the ultimate manner in which the files are created.

2. Additional computations indicate that there are considerable savings
in file storage if the compressing scheme is employed. They are
outlined as follows:

The Northern Venezuela data base of 31 variables including
topographic elevation and aspect showed a storage savings of
1Z% if compressed multivariable files were used. If slope
and aspect had not been included in the multivariable data
file but stored as separate files the storage savings would
have been 80% on the multivariable file, but the storage of
separate files for these variables would cut the actual savings
to only 76% for a 4% relative reduction. If all data items
varied greatly then these compression approaches would not be
worthwhile. The processing time for compression and expansion
is not significant and should be easily offset by the reduced
time accessing data from the file.


*Cell is defined here to mean any spatial data unit and is not limited to
a "grid" cell.

**I am assuming that the unique ability of triangles to represent terrain
can be generated by computer from an x,y coordinate file of terrain and
ridge and course lines.




Seminar  Participants 




Bill S. Eichert        Director, the Hydrologic Engineering
                       Center, Davis, California

Darryl W. Davis        Chief, Planning Analysis Branch,
                       the Hydrologic Engineering Center,
                       Davis, California

Arlen D. Feldman       Chief, Research Branch, the Hydrologic
                       Engineering Center, Davis, California

R. Pat Webb            Seminar Coordinator, Planning Analysis
                       Branch, the Hydrologic Engineering
                       Center, Davis, California

Kenneth J. Dueker      Director, Institute of Urban and
                       Regional Research; Professor, University
                       of Iowa, Iowa City, Iowa

Robert H. Ericksen     Doctorate Candidate, Department of
                       Geography, University of Iowa, Iowa City,
                       Iowa

Richard M. Males       Vice President, W. E. Gates & Associates,
                       Batavia, Ohio

Jack Dangermond        Director, Environmental Systems Research
                       Institute, Redlands, California

Raymond J. Postma      Programmer Analyst, Environmental
                       Systems Research Institute, Redlands,
                       California




UNCLASSIFIED
SECURITY CLASSIFICATION OF THIS PAGE (When Data Entered)

REPORT DOCUMENTATION PAGE
(READ INSTRUCTIONS BEFORE COMPLETING FORM)

1. REPORT NUMBER:

2. GOVT ACCESSION NO.:

3. RECIPIENT'S CATALOG NUMBER:

4. TITLE (and Subtitle):
Proceedings of a Seminar on Variable Grid
Resolution - Issues and Requirements

5. TYPE OF REPORT & PERIOD COVERED:
Seminar proceedings

6. PERFORMING ORG. REPORT NUMBER:

7. AUTHOR(s):

8. CONTRACT OR GRANT NUMBER(s):

9. PERFORMING ORGANIZATION NAME AND ADDRESS:
The Hydrologic Engineering Center
609 Second St.
Davis, California 95616

10. PROGRAM ELEMENT, PROJECT, TASK AREA & WORK UNIT NUMBERS:

11. CONTROLLING OFFICE NAME AND ADDRESS:
Hydrologic Engineering Center
US Army Corps of Engineers
609 Second St., Davis, CA 95616

12. REPORT DATE:
August 1977

13. NUMBER OF PAGES:
115

14. MONITORING AGENCY NAME & ADDRESS (if different from Controlling Office):

15. SECURITY CLASS. (of this report):
Unclassified

15a. DECLASSIFICATION/DOWNGRADING SCHEDULE:

16. DISTRIBUTION STATEMENT (of this Report):
Approved for Public Release - Distribution Unlimited

17. DISTRIBUTION STATEMENT (of the abstract entered in Block 20, if different from Report):

18. SUPPLEMENTARY NOTES:

19. KEY WORDS (Continue on reverse side if necessary and identify by block number):
Geographic information systems
Grid cell data banks
Triangle overlay
Grid cell
Digital terrain model
Polygon overlay

20. ABSTRACT (Continue on reverse side if necessary and identify by block number):
Anyone creating a geographic information system (GIS) is confronted with the
problem of needing more detail in certain parts of the study area and less
detail in other parts. This problem may be handled in the classification,
encoding or data structure stages of creating a GIS. The seminar participants
worked on this problem primarily in terms of data structure such as triangles,
polygon overlay to a variable grid and variable grid by a packed run-length
data storage scheme.

DD 1473  EDITION OF 1 NOV 65 IS OBSOLETE                 UNCLASSIFIED
SECURITY CLASSIFICATION OF THIS PAGE (When Data Entered)
